A method for achieving site specific integration of a desired DNA at a target site in a mammalian cell via homologous recombination 
is described. This method provides for the reproducible selection of cell lines wherein a desired DNA is integrated at a predetermined 
transcriptionally active site previously marked with a marker plasmid. The method is particularly suitable for the production of mammalian 
cell lines which secrete mammalian proteins at high levels, in particular immunoglobulins. Vectors and vector combinations for use in the 
subject cloning method are also provided. 




WO 98/41645 



PCT/US98/03935 



Title of the Invention

METHOD FOR INTEGRATING GENES AT SPECIFIC SITES IN MAMMALIAN CELLS VIA
HOMOLOGOUS RECOMBINATION AND VECTORS FOR ACCOMPLISHING THE SAME

Field of the Invention

The present invention relates to a process of targeting the
integration of a desired exogenous DNA to a specific location within
the genome of a mammalian cell. More specifically, the invention
describes a novel method for identifying a transcriptionally active
target site ("hot spot") in the mammalian genome, and inserting a
desired DNA at this site via homologous recombination. The invention
also optionally provides for gene amplification of the desired DNA at
this location by co-integrating an amplifiable selectable marker,
e.g., DHFR, in combination with the exogenous DNA. The invention
additionally describes the construction of novel vectors suitable for
accomplishing the above, and further provides mammalian cell lines
produced by such methods which contain a desired exogenous DNA
integrated at a target hot spot.
Background

Technology for expressing recombinant proteins in both prokaryotic
and eukaryotic organisms is well established. Mammalian cells offer
significant advantages over bacteria or yeast for protein production,
resulting from their ability to correctly assemble, glycosylate and
post-translationally modify recombinantly expressed proteins. After
transfection into the host cells, recombinant expression constructs
can be maintained as extrachromosomal elements, or may be integrated
into the host cell genome. Generation of stably transfected mammalian
cell lines usually involves the latter; a DNA construct encoding a
gene of interest along with a drug resistance gene (dominant
selectable marker) is introduced into the host cell, and subsequent
growth in the presence of the drug allows for the selection of cells
that have successfully integrated the exogenous DNA. In many
instances, the gene of interest is linked to a drug resistant
selectable marker which can later be subjected to gene amplification.
The gene encoding dihydrofolate reductase (DHFR) is most commonly
used for this purpose. Growth of cells in the presence of
methotrexate, a competitive inhibitor of DHFR, leads to increased
DHFR production by means of amplification of the DHFR gene. As
flanking regions of DNA will also become amplified, the resultant
coamplification of a DHFR linked gene in the transfected cell line
can lead to increased protein production, thereby resulting in high
level expression of the gene of interest.

While this approach has proven successful, there are a number of
problems with the system because of the random nature of the
integration event. These problems exist because expression levels are
greatly influenced by the effects of the local genetic environment at
the gene locus, a phenomenon well documented in the literature and
generally referred to as "position effects" (for example, see
Al-Shawi et al, Mol. Cell. Biol., 10:1192-1198 (1990); Yoshimura et
al, Mol. Cell. Biol., 7:1296-1299 (1987)). As the vast majority of
mammalian DNA is in a transcriptionally inactive state, random
integration methods offer no control over the transcriptional fate of
the integrated DNA. Consequently, wide variations in the expression
level of integrated genes can occur, depending on the site of
integration. For example, integration of exogenous DNA into inactive,
or transcriptionally "silent" regions of the genome will result in
little or no expression. By contrast, integration into a
transcriptionally active site may result in high expression.

Therefore, when the goal of the work is to obtain a high level of
gene expression, as is typically the desired outcome of genetic
engineering methods, it is generally necessary to screen large
numbers of transfectants to find such a high producing clone.
Additionally, random integration of exogenous DNA into the genome can
in some instances disrupt important cellular genes, resulting in an
altered phenotype. These factors can make the generation of high
expressing stable mammalian cell lines a complicated and laborious
process.

Recently, our laboratory has described the use of DNA vectors
containing translationally impaired dominant selectable markers in
mammalian gene expression. (This is disclosed in U.S. Serial No.
08/147,696 filed November 3, 1993, recently allowed).

These vectors contain a translationally impaired neomycin
phosphotransferase (neo) gene as the dominant selectable marker,
artificially engineered to contain an intron into which a DHFR gene
along with a gene or genes of interest is inserted. Use of these
vectors as expression constructs has been found to significantly
reduce the total number of drug resistant colonies produced, thereby
facilitating the screening procedure in relation to conventional
mammalian expression vectors. Furthermore, a significant percentage
of the clones obtained using this system are high expressing clones.
These results are apparently attributable to the modifications made
to the neo selectable marker. Due to the translational impairment of
the neo gene, transfected cells will not produce enough neo protein
to survive drug selection, thereby decreasing the overall number of
drug resistant colonies. Additionally, a higher percentage of the
surviving clones will contain the expression vector integrated into
sites in the genome where basal transcription levels are high,
resulting in overproduction of neo, thereby allowing the cells to
overcome the impairment of the neo gene. Concomitantly, the genes of
interest linked to neo will be subject to similar elevated levels of
transcription. This same advantage is also true as a result of the
artificial intron created within neo; survival is dependent on the
synthesis of a functional neo gene, which is in turn dependent on
correct and efficient splicing of the neo introns. Moreover, these
criteria are more likely to be met if the vector DNA has integrated
into a region which is already highly transcriptionally active.

Following integration of the vector into a transcriptionally active
region, gene amplification is performed by selection for the DHFR
gene. Using this system, it has been possible to obtain clones
selected using low levels of methotrexate (50nM), containing few
(<10) copies of the vector which secrete high levels of protein
(>55pg/cell/day). Furthermore, this can be achieved in a relatively
short period of time. However, the success in amplification is
variable. Some transcriptionally active sites cannot be amplified and
therefore the frequency and extent of amplification from a particular
site is not predictable.
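
The unit pg/cell/day used above is a specific productivity: secreted
protein mass divided by the number of producing cells and by the
length of the collection period. The arithmetic can be sketched in
Python as follows. The function name and the sample numbers are
hypothetical illustrations, not data from this application, and the
sketch assumes the viable cell count stays constant over the
collection period.

```python
def specific_productivity_pg_per_cell_day(protein_ug, viable_cells, days):
    """Return secreted protein per cell per day, in picograms.

    Assumption (illustration only): the viable cell count is constant
    over the whole collection period.
    """
    if viable_cells <= 0 or days <= 0:
        raise ValueError("viable_cells and days must be positive")
    protein_pg = protein_ug * 1e6  # 1 microgram = 1,000,000 picograms
    return protein_pg / (viable_cells * days)


# Hypothetical numbers: 110 ug of protein from 1e6 cells over 2 days.
qp = specific_productivity_pg_per_cell_day(110.0, 1_000_000, 2.0)
print(f"{qp:.1f} pg/cell/day")
```

In practice the cell count changes during culture, so a real
calculation would normally integrate viable cells over time rather
than use a single count.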

Overall, the use of these translationally impaired 
vectors represents a significant improvement over other 
methods of random integration. However, as discussed, 
the problem of lack of control over the integration site 
remains a significant concern. 

One approach to overcome the problems of random integration is by
means of gene targeting, whereby the exogenous DNA is directed to a
specific locus within the host genome. The exogenous DNA is inserted
by means of homologous recombination occurring between sequences of
DNA in the expression vector and the corresponding homologous
sequence in the genome. However, while this type of recombination
occurs at a high frequency naturally in yeast and other fungal
organisms, in higher eukaryotic organisms it is an extremely rare
event. In mammalian cells, the frequency of homologous versus
non-homologous (random integration) recombination is reported to
range from 1/100 to 1/5000 (for example, see Capecchi, Science,
244:1288-1292 (1989); Morrow and Kucherlapati, Curr. Op. Biotech.,
4:577-582 (1993)).
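
To give a feel for what these ratios mean for screening when there is
no selection against random integrants, the following Python sketch
estimates how many colonies would have to be examined to find at
least one homologous recombinant. It is an illustration under stated
assumptions, not a method from the cited references: colonies are
treated as independent, the reported figure is read as a
homologous:non-homologous ratio, and the function names and the 95%
confidence level are hypothetical choices.

```python
import math


def homologous_fraction(ratio):
    """Convert a homologous:non-homologous ratio into the fraction of
    all integrants that are homologous."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return ratio / (1.0 + ratio)


def colonies_for_confidence(ratio, confidence=0.95):
    """Smallest number of independent colonies to screen so that at
    least one homologous recombinant is found with the given
    probability, i.e. the smallest n with 1 - (1 - p)**n >= confidence."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    p = homologous_fraction(ratio)
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


# The two ends of the range quoted in the text.
for ratio in (1 / 100, 1 / 5000):
    print(f"ratio {ratio:.4f}: screen about {colonies_for_confidence(ratio)} colonies")
```

Under these assumptions the screening burden runs from hundreds to
many thousands of colonies, which is why a selection scheme that
removes random integrants is attractive.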

One of the earliest reports describing homologous recombination in
mammalian cells comprised an artificial system created in mouse
fibroblasts (Thomas et al, Cell, 44:419-428 (1986)). A cell line
containing a mutated, non-functional version of the neo gene
integrated into the host genome was created, and subsequently
targeted with a second non-functional copy of neo containing a
different mutation. Reconstruction of a functional neo gene could
occur only by gene targeting. Homologous recombinants were identified
by selecting for G418 resistant cells, and confirmed by analysis of
genomic DNA isolated from the resistant clones.

Recently, the use of homologous recombination to 
replace the heavy and light immunoglobulin genes at 
endogenous loci in antibody secreting cells has been 
reported. (U.S. Patent No. 5,202,238, Fell et al, 
(1993).) However, this particular approach is not 
widely applicable, because it is limited to the 
production of immunoglobulins in cells which 
endogenously express immunoglobulins, e.g., B cells and 
myeloma cells. Also, expression is limited to single 
copy gene levels because co-amplification after 
homologous recombination is not included. The method is 
further complicated by the fact that two separate 
integration events are required to produce a functional 
immunoglobulin: one for the light chain gene followed by 
one for the heavy chain gene. 

An additional example of this type of system has been reported in
NS/0 cells, where recombinant immunoglobulins are expressed by
homologous recombination into the immunoglobulin gamma 2A locus
(Hollis et al, international patent application # PCT/IB95 (00014).)
Expression levels obtained from this site were extremely high, on the
order of 20pg/cell/day from a single copy integrant. However, as in
the above example, expression is limited to this level because an
amplifiable gene is not cointegrated in this system. Also, other
researchers have reported aberrant glycosylation of recombinant
proteins expressed in NS/0 cells (for example, see Flesher et al,
Biotech. and Bioeng., 48:399-407 (1995)), thereby limiting the
applicability of this approach.

The cre-loxP recombination system from bacteriophage P1 has recently
been adapted and used as a means of gene targeting in eukaryotic
cells. Specifically, the site specific integration of exogenous DNA
into the Chinese hamster ovary (CHO) cell genome using cre
recombinase and a series of lox containing vectors has been
described. (Fukushige and Sauer, Proc. Natl. Acad. Sci. USA,
89:7905-7909 (1992).) This system is attractive in that it provides
for reproducible expression at the same chromosomal location.
However, no effort was made to identify a chromosomal site from which
gene expression is optimal, and as in the above example, expression
is limited to single copy levels in this system. Also, it is
complicated by the fact that one needs to provide for expression of a
functional recombinase enzyme in the mammalian cell.
The use of homologous recombination between an introduced DNA
sequence and its endogenous chromosomal locus has also been reported
to provide a useful means of genetic manipulation in mammalian cells,
as well as in yeast cells. (See e.g., Bradley et al, Meth. Enzymol.,
223:855-879 (1993); Capecchi, Science, 244:1288-1292 (1989);
Rothstein et al, Meth. Enzymol., 194:281-301 (1991)). To date, most
mammalian gene targeting studies have been directed toward gene
disruption ("knockout") or site-specific mutagenesis of selected
target gene loci in mouse embryonic stem (ES) cells. The creation of
these "knockout" mouse models has enabled scientists to examine
specific structure-function issues and examine the biological
importance of a myriad of mouse genes. This field of research also
has important implications in terms of potential gene therapy
applications.

Also, vectors have recently been reported by Celltech (Kent, U.K.)
which purportedly are targeted to transcriptionally active sites in
NS0 cells, which do not require gene amplification (Peakman et al,
Hum. Antibod. Hybridomas, 5:65-74 (1994)). However, levels of
immunoglobulin secretion in these unamplified cells have not been
reported to exceed 20pg/cell/day, while in amplified CHO cells,
levels as high as 100pg/cell/day can be obtained (Id.).
It would be highly desirable to develop a gene targeting system which
reproducibly provided for the integration of exogenous DNA into a
predetermined site in the genome known to be transcriptionally
active. Also, it would be desirable if such a gene targeting system
would further facilitate co-amplification of the inserted DNA after
integration. The design of such a system would allow for the
reproducible and high level expression of any cloned gene of interest
in a mammalian cell, and undoubtedly would be of significant interest
to many researchers.

In this application, we provide a novel mammalian expression system,
based on homologous recombination occurring between two artificial
substrates contained in two different vectors. Specifically, this
system uses a combination of two novel mammalian expression vectors,
referred to as a "marking" vector and a "targeting" vector.

Essentially, the marking vector enables the identification and
marking of a site in the mammalian genome which is transcriptionally
active, i.e., a site at which gene expression levels are high. This
site can be regarded as a "hot spot" in the genome. After integration
of the marking vector, the subject expression system enables another
DNA to be integrated at this site, i.e., the targeting vector, by
means of homologous recombination occurring between DNA sequences
common to both vectors. This system affords significant advantages
over other homologous recombination systems.

Unlike most other homologous systems employed in mammalian cells,
this system exhibits no background. Therefore, cells which have only
undergone random integration of the vector do not survive the
selection. Thus, any gene of interest cloned into the targeting
plasmid is expressed at high levels from the marked hot spot.
Accordingly, the subject method of gene expression substantially or
completely eliminates the problems inherent to systems of random
integration, discussed in detail above. Moreover, this system
provides reproducible and high level expression of any recombinant
protein at the same transcriptionally active site in the mammalian
genome. In addition, gene amplification may be effected at this
particular transcriptionally active site by including an amplifiable
dominant selectable marker (e.g. DHFR) as part of the marking vector.

Objects of the Invention 

Thus, it is an object of the invention to provide an improved method
for targeting a desired DNA to a specific site in a mammalian cell.

It is a more specific object of the invention to provide a novel
method for targeting a desired DNA to a specific site in a mammalian
cell via homologous recombination.

It is another specific object of the invention to provide novel
vectors for achieving site specific integration of a desired DNA in a
mammalian cell.

It is still another object of the invention to provide novel
mammalian cell lines which contain a desired DNA integrated at a
predetermined site which provides for high expression.

It is a more specific object of the invention to provide a novel
method for achieving site specific integration of a desired DNA in a
Chinese hamster ovary (CHO) cell.

It is another more specific object of the invention to provide a
novel method for integrating immunoglobulin genes, or any other
genes, in mammalian cells at predetermined chromosomal sites that
provide for high expression.

It is another specific object of the invention to provide novel
vectors and vector combinations suitable for integrating
immunoglobulin genes into mammalian cells at predetermined sites that
provide for high expression.

It is another object of the invention to provide mammalian cell lines
which contain immunoglobulin genes integrated at predetermined sites
that provide for high expression.

It is an even more specific object of the invention to provide a
novel method for integrating immunoglobulin genes into CHO cells that
provide for high expression, as well as novel vectors and vector
combinations that provide for such integration of immunoglobulin
genes into CHO cells.

In addition, it is a specific object of the invention to provide
novel CHO cell lines which contain immunoglobulin genes integrated at
predetermined sites that provide for high expression, and have been
amplified by methotrexate selection to secrete even greater amounts
of functional immunoglobulins.

Brief Description of the Figures

Figure 1 depicts a map of a marking plasmid according to the
invention referred to as Desmond. The plasmid is shown in circular
form (1a) as well as a linearized version used for transfection (1b).

Figure 2(a) shows a map of a targeting plasmid referred to as
"Molly". Molly is shown here encoding the anti-CD20 immunoglobulin
genes, expression of which is described in Example 1.

Figure 2(b) shows a linearized version of Molly, after digestion with
the restriction enzymes KpnI and PacI. This linearized form was used
for transfection.

Figure 3 depicts the potential alignment between Desmond sequences
integrated into the CHO genome, and incoming targeting Molly
sequences. One potential arrangement of Molly integrated into Desmond
after homologous recombination is also presented.

Figure 4 shows a Southern analysis of single copy Desmond clones.
Samples are as follows:
Lane 1: λHindIII DNA size marker
Lane 2: Desmond clone 10F3
Lane 3: Desmond clone 10C12
Lane 4: Desmond clone 15C9
Lane 5: Desmond clone 14B5
Lane 6: Desmond clone 9B2

Figure 5 shows a Northern analysis of single copy Desmond clones.
Samples are as follows: Panel A: Northern probed with CAD and DHFR
probes, as indicated on the figure. Panel B: duplicate Northern,
probed with CAD and HisD probes, as indicated. The RNA samples loaded
in panels A and B are as follows:
Lane 1: clone 9B2; lane 2: clone 10C12; lane 3: clone 14B5; lane 4:
clone 15C9; lane 5: control RNA from CHO transfected with a HisD and
DHFR containing plasmid; lane 6: untransfected CHO.

Figure 6 shows a Southern analysis of clones resulting from the
homologous integration of Molly into Desmond. Samples are as follows:
Lane 1: λHindIII DNA size markers; lane 2: 20F4; lane 3: 5F9; lane 4:
21C7; lane 5: 24G2; lane 6: 25E1; lane 7: 28C9; lane 8: 29F9; lane 9:
39G11; lane 10: 42F9; lane 11: 50G10; lane 12: Molly plasmid DNA,
linearized with BglII (top band) and cut with BglII and KpnI (lower
band); lane 13: untransfected Desmond.

Figures 7A through 7G contain the Sequence Listing for Desmond.

Figures 8A through 8I contain the Sequence Listing for
Molly-containing anti-CD20.

Figure 9 contains a map of the targeting plasmid, "Mandy," shown here
encoding anti-CD23 genes, the expression of which is disclosed in
Example 5.

Figures 10A through 10N contain the sequence listing of "Mandy"
containing the anti-CD23 genes as disclosed in Example 5.

Detailed Description of the Invention 

The invention provides a novel method for integrating a desired
exogenous DNA at a target site within the genome of a mammalian cell
via homologous recombination. Also, the invention provides novel
vectors for achieving the site specific integration of a DNA at a
target site in the genome of a mammalian cell.

More specifically, the subject cloning method provides for site
specific integration of a desired DNA in a mammalian cell by
transfection of such cell with a "marker plasmid" which contains a
unique sequence that is foreign to the mammalian cell genome and
which provides a substrate for homologous recombination, followed by
transfection with a "target plasmid" containing a sequence which
provides for homologous recombination with the unique sequence
contained in the marker plasmid, and further comprising a desired DNA
that is to be integrated into the mammalian cell. Typically, the
integrated DNA will encode a protein of interest, such as an
immunoglobulin or other secreted mammalian glycoprotein.

The exemplified homologous recombination system uses the neomycin
phosphotransferase gene as a dominant selectable marker. This
particular marker was utilized based on the following previously
published observations:

(i) the demonstrated ability to target and restore function to a
mutated version of the neo gene (cited earlier) and

(ii) our development of translationally impaired expression vectors,
in which the neo gene has been artificially created as two exons with
a gene of interest inserted in the intervening intron; neo exons are
correctly spliced and translated in vivo, producing a functional
protein and thereby conferring G418 resistance on the resultant cell
population. In this application, the neo gene is split into three
exons. The third exon of neo is present on the "marker" plasmid and
becomes integrated into the host cell genome upon integration of the
marker plasmid into the mammalian cells. Exons 1 and 2 are present on
the targeting plasmid, and are separated by an intervening intron
into which at least one gene of interest is cloned. Homologous
recombination of the targeting vector with the integrated marking
vector results in correct splicing of all three exons of the neo gene
and thereby expression of a functional neo protein (as determined by
selection for G418 resistant colonies). Prior to designing the
current expression system, we had experimentally tested the
functionality of such a triply spliced neo construct in mammalian
cells. The results of this control experiment indicated that all
three neo exons were properly spliced and therefore suggested the
feasibility of the subject invention.
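
The selection logic of the split neo gene can be sketched as a toy
model in Python. This is an illustration only and not part of the
disclosed method: each genomic locus is reduced to the set of neo
exons joined there, the marked cell is assumed to carry exon 3 at a
single locus, and all names (`cell_loci`, `survives_g418`, the event
labels) are hypothetical.

```python
# Toy model: a locus is represented by the set of neo exons joined there.
MARKER_EXONS = frozenset({3})        # exon 3, carried by the marker plasmid
TARGETING_EXONS = frozenset({1, 2})  # exons 1 and 2, carried by the targeting plasmid
FULL_NEO = frozenset({1, 2, 3})      # all three exons joined at one locus


def cell_loci(targeting_event):
    """Return the neo-bearing loci of a marked cell after transfection.

    targeting_event is one of:
      "none"       - targeting plasmid did not integrate
      "random"     - targeting plasmid integrated at an unrelated site
      "homologous" - targeting plasmid recombined into the marked site
    """
    if targeting_event == "none":
        return [MARKER_EXONS]
    if targeting_event == "random":
        # Exons 1 and 2 land elsewhere and stay separated from exon 3.
        return [MARKER_EXONS, TARGETING_EXONS]
    if targeting_event == "homologous":
        # Exons 1 and 2 are joined to exon 3 at the marked site.
        return [MARKER_EXONS | TARGETING_EXONS]
    raise ValueError(f"unknown targeting event: {targeting_event!r}")


def survives_g418(loci):
    """A cell is G418 resistant only if one locus carries all three exons."""
    return any(locus == FULL_NEO for locus in loci)


for event in ("none", "random", "homologous"):
    print(event, survives_g418(cell_loci(event)))
```

The model captures only the point made in the text: neither plasmid
alone, nor both at separate sites, yields a complete neo gene, so
resistance reports on the homologous event.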

However, while the present invention is exemplified using the neo
gene, and more specifically a triple split neo gene, the general
methodology should be efficacious with other dominant selectable
markers.

As discussed in greater detail infra, the present invention affords
numerous advantages over conventional gene expression methods,
including both random integration and gene targeting methods.
Specifically, the subject invention provides a method which
reproducibly allows for site-specific integration of a desired DNA
into a transcriptionally active domain of a mammalian cell. Moreover,
because the subject method introduces an artificial region of
"homology" which acts as a unique substrate for homologous
recombination and the insertion of a desired DNA, the efficacy of the
subject invention does not require that the cell endogenously contain
or express a specific DNA. Thus, the method is generically applicable
to all mammalian cells, and can be used to express any type of
recombinant protein.

The use of a triply spliced selectable marker, e.g., the exemplified
triply spliced neo construct, guarantees that all G418 resistant
colonies produced will arise from a homologous recombination event
(random integrants will not produce a functional neo gene and
consequently will not survive G418 selection). Thus, the subject
invention makes it easy to screen for the desired homologous event.
Furthermore, the frequency of additional random integrations in a
cell that has undergone a homologous recombination event appears to
be low.

Based on the foregoing, it is apparent that a significant advantage
of the invention is that it substantially reduces the number of
colonies that need be screened to identify high producer clones,
i.e., cell lines containing a desired DNA which secrete the
corresponding protein at high levels. On average, clones containing
integrated desired DNA may be identified by screening about 5 to 20
colonies (compared to several thousand which must be screened when
using standard random integration techniques, or several hundred
using the previously described intronic insertion vectors).
Additionally, as the site of integration was preselected and
comprises a transcriptionally active domain, all exogenous DNA
expressed at this site should produce comparable, i.e. high, levels
of the protein of interest.

Moreover, the subject invention is further advantageous in that it
enables an amplifiable gene to be inserted on integration of the
marking vector. Thus, when a desired gene is targeted to this site
via homologous recombination, the subject invention allows for
expression of the gene to be further enhanced by gene amplification.
In this regard, it has been reported in the literature that different
genomic sites have different capacities for gene amplification
(Meinkoth et al, Mol. Cell Biol., 7:1415-1424 (1987)). Therefore,
this technique is further advantageous as it allows for the placement
of a desired gene of interest at a specific site that is both
transcriptionally active and easily amplified. Therefore, this should
significantly reduce the amount of time required to isolate such high
producers.

Specifically, while conventional methods for the construction of high
expressing mammalian cell lines can take 6 to 9 months, the present
invention allows for such clones to be isolated on average after only
about 3-6 months. This is due to the fact that conventionally
isolated clones typically must be subjected to at least three rounds
of drug resistant gene amplification in order to reach satisfactory
levels of gene expression. As the homologously produced clones are
generated from a preselected site which is a high expression site,
fewer rounds of amplification should be required before reaching a
satisfactory level of production.

Still further, the subject invention enables the reproducible
selection of high producer clones wherein the vector is integrated at
low copy number, typically single copy. This is advantageous as it
enhances the stability of the clones and avoids other potential
adverse side-effects associated with high copy number. As described
supra, the subject homologous recombination system uses the
combination of a "marker plasmid" and a "targeting plasmid" which are
described in more detail below.

The "marker plasmid" which is used to mark and identify a
transcriptional hot spot will comprise at least the following
sequences:

(i) a region of DNA that is heterologous or unique to the genome of
the mammalian cell, which functions as a source of homology and
allows for homologous recombination (with a DNA contained in a second
target plasmid). More specifically, the unique region of DNA (i) will
generally comprise a bacterial, viral, yeast, synthetic, or other DNA
which is not normally present in the mammalian cell genome and which
further does not comprise significant homology or sequence identity
to DNA contained in the genome of the mammalian cell. Essentially,
this sequence should be sufficiently different from mammalian DNA
that it will not significantly recombine with the host cell genome
via homologous recombination. The size of such unique DNA will
generally be at least about 2 to 10 kilobases in size, or higher,
more preferably at least about 10kb, as several other investigators
have noted an increased frequency of targeted recombination as the
size of the homology region is increased (Capecchi, Science,
244:1288-1292 (1989)).

The upper size limit of the unique DNA which acts as a site for
homologous recombination with a sequence in the second target vector
is largely dictated by potential stability constraints (if DNA is too
large it may not be easily integrated into a chromosome) and the
difficulties in working with very large DNAs.

(ii) a DNA including a fragment of a selectable marker DNA, typically
an exon of a dominant selectable marker gene. The only essential
feature of this DNA is that it not encode a functional selectable
marker protein unless it is expressed in association with a sequence
contained in the target plasmid. Typically, the target plasmid will
comprise the remaining exons of the dominant selectable marker gene
(those not comprised in the "marker" plasmid). Essentially, a
functional selectable marker should only be produced if homologous
recombination occurs (resulting in the association and expression of
this marker DNA (ii) sequence together with the portion(s) of the
selectable marker DNA fragment which is (are) contained in the target
plasmid).

As noted, the current invention exemplifies the
use of the neomycin phosphotransferase gene as the
dominant selectable marker which is "split" in the two
vectors. However, other selectable markers should also be
suitable, e.g., the Salmonella histidinol dehydrogenase
gene, hygromycin phosphotransferase gene, herpes simplex
virus thymidine kinase gene, adenosine deaminase gene,
glutamine synthetase gene and hypoxanthine-guanine
phosphoribosyl transferase gene.

(iii) a DNA which encodes a functional selectable
marker protein, which selectable marker is different
from the selectable marker DNA (ii). This selectable
marker provides for the successful selection of
mammalian cells wherein the marker plasmid is successfully
integrated into the cellular DNA. More preferably, it
is desirable that the marker plasmid comprise two such
dominant selectable marker DNAs, situated at opposite
ends of the vector. This is advantageous as it enables
integrants to be selected using different selection
agents and further enables cells which contain the
entire vector to be selected. Additionally, one marker
can be an amplifiable marker to facilitate gene
amplification as discussed previously. Any of the
dominant selectable markers listed in (ii) can be used as
well as others generally known in the art.

Moreover, the marker plasmid may optionally further
comprise a rare endonuclease restriction site. This is
potentially desirable as this may facilitate cleavage.
If present, such rare restriction site should be
situated close to the middle of the unique region that acts as
a substrate for homologous recombination. Preferably
such sequence will be at least about 12 nucleotides.
The introduction of a double stranded break by similar
methodology has been reported to enhance the frequency
of homologous recombination (Choulika et al, Mol.
Cell. Biol., 15:1968-1973 (1995)). However, the
presence of such sequence is not essential.

The "targeting plasmid" will comprise at least the
following sequences:

(1) the same unique region of DNA contained in the
marker plasmid or one having sufficient homology or
sequence identity therewith that said DNA is capable of
combining via homologous recombination with the unique
region (i) in the marker plasmid. Suitable types of
DNAs are described supra in the description of the
unique region of DNA (i) in the marker plasmid.

(2) The remaining exons of the dominant selectable
marker, one exon of which is included as (ii) in the
marker plasmid listed above. The essential features of
this DNA fragment are that it result in a functional
(selectable) marker protein only if the target plasmid
integrates via homologous recombination (wherein such
recombination results in the association of this DNA
with the other fragment of the selectable marker DNA
contained in the marker plasmid) and further that it
allow for insertion of a desired exogenous DNA.
Typically, this DNA will comprise the remaining exons of the
selectable marker DNA which are separated by an intron.
For example, this DNA may comprise the first two exons
of the neo gene and the marker plasmid may comprise the
third exon (back third of neo).

(3) The target plasmid will also comprise a
desired DNA, e.g., one encoding a desired polypeptide,
preferably inserted within the selectable marker DNA
fragment contained in the plasmid. Typically, the DNA
will be inserted in an intron which is comprised between
the exons of the selectable marker DNA. This ensures
that the desired DNA is also integrated if homologous
recombination of the target plasmid and the marker
plasmid occurs. This intron may be naturally occurring or
it may be engineered into the dominant selectable marker
DNA fragment.

This DNA will encode any desired protein,
preferably one having pharmaceutical or other desirable
properties. Most typically the DNA will encode a
mammalian protein, and in the current examples provided,
an immunoglobulin or an immunoadhesin. However, the
invention is not in any way limited to the production of
immunoglobulins.
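
The selection logic of the split marker described above can be sketched as a small model. This is an illustrative sketch only: the class and function names are hypothetical, and the model assumes the exemplified arrangement in which the marker plasmid carries neo exon 3 and the targeting plasmid carries neo exons 1 and 2 together with the desired DNA.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: the selectable marker is split into three exons, as in the
# neo example; all three must be joined at one locus for resistance.
NEO_EXONS = frozenset({1, 2, 3})


@dataclass(frozen=True)
class IntegratedLocus:
    """Hypothetical description of one integration site in the host genome."""
    neo_exons: frozenset
    payload: Optional[str] = None  # the desired DNA, if present


def is_g418_resistant(locus: IntegratedLocus) -> bool:
    # A functional marker protein requires every exon at the same locus.
    return NEO_EXONS <= locus.neo_exons


def homologous_integration(marker_locus, targeting_exons, payload):
    # Recombination through the shared homology region brings the
    # targeting-plasmid exons (and the payload in their intron) to the
    # marked site, next to the exon already present there.
    return IntegratedLocus(marker_locus.neo_exons | targeting_exons, payload)


marker_site = IntegratedLocus(frozenset({3}))                    # marker plasmid alone
random_site = IntegratedLocus(frozenset({1, 2}), "desired DNA")  # random integration
targeted_site = homologous_integration(marker_site, frozenset({1, 2}), "desired DNA")

for name, locus in [("marker only", marker_site),
                    ("random integrant", random_site),
                    ("targeted integrant", targeted_site)]:
    print(name, "resistant" if is_g418_resistant(locus) else "sensitive")
```

In this model only the targeted integrant scores as resistant, which is the property the split marker is intended to provide; the model ignores any other route to a functional marker.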

As discussed previously, the subject cloning method
is suitable for any mammalian cell as it does not
require for efficacy that any specific mammalian sequence
or sequences be present. In general, such mammalian
cells will comprise those typically used for protein
expression, e.g., CHO cells, myeloma cells, COS cells,
BHK cells, Sp2/0 cells, NIH 3T3 and HeLa cells. In the
examples which follow, CHO cells were utilized. The
advantages thereof include the availability of suitable
growth medium, their ability to grow efficiently and to
high density in culture, and their ability to express
mammalian proteins such as immunoglobulins in
biologically active form.

Further, CHO cells were selected in large part
because of previous usage of such cells by the inventors
for the expression of immunoglobulins (using the
translationally impaired dominant selectable marker
containing vectors described previously). Thus, the present
laboratory has considerable experience in using such
cells for expression. However, based on the examples
which follow, it is reasonable to expect similar results
will be obtained with other mammalian cells.

In general, transformation or transfection of
mammalian cells according to the subject invention will be
effected according to conventional methods. So that the
invention may be better understood, the construction of
exemplary vectors and their usage in producing
integrants is described in the examples below.

EXAMPLE 1

Design and Preparation of Marker
and Targeting Plasmid DNA Vectors

The marker plasmid, herein referred to as "Desmond",
was assembled from the following DNA elements:

(a) Murine dihydrofolate reductase gene (DHFR),
incorporated into a transcription cassette comprising
the mouse beta globin promoter 5' to the DHFR start
site, and bovine growth hormone polyadenylation signal
3' to the stop codon. The DHFR transcriptional cassette
was isolated from TCAE6, an expression vector created
previously in this laboratory (Newman et al, 1992,
Biotechnology, 10:1455-1460).

(b) E. coli β-galactosidase gene, commercially
available, obtained from Promega as pSV-β-galactosidase
control vector, catalog # E1081.

(c) Baculovirus DNA, commercially available,
purchased from Clontech as pBAKPAK8, cat # 6145-1.

(d) Cassette comprising promoter and enhancer
elements from Cytomegalovirus and SV40 virus. The cassette
was generated by PCR using a derivative of expression
vector TCAE8 (Reff et al, Blood, 83:435-445 (1994)).
The enhancer cassette was inserted within the
baculovirus sequence, which was first modified by the
insertion of a multiple cloning site.

(e) E. coli GUS (glucuronidase) gene, commercially
available, purchased from Clontech as pBI101, cat. #
6017-1.

(f) Firefly luciferase gene, commercially
available, obtained from Promega as pGEM-Luc (catalog #
E1541).

(g) S. typhimurium histidinol dehydrogenase gene
(HisD). This gene was originally a gift (Donahue
et al, Gene, 18:47-59 (1982)), and has subsequently been
incorporated into a transcription cassette comprising
the mouse beta globin major promoter 5' to the gene, and
the SV40 polyadenylation signal 3' to the gene.

The DNA elements described in (a)-(g) were combined
into a pBR derived plasmid backbone to produce a 7.7 kb
contiguous stretch of DNA referred to in the attached
figures as "homology". Homology in this sense refers to
sequences of DNA which are not part of the mammalian
genome and are used to promote homologous recombination
between transfected plasmids sharing the same homology
DNA sequences.

(h) Neomycin phosphotransferase gene from Tn5
(Davis and Smith, Ann. Rev. Micro., 32:469-518 (1978)).
The complete neo gene was subcloned into pBluescript
SK- (Stratagene catalog # 212205) to facilitate genetic
manipulation. A synthetic linker was then inserted into
a unique PstI site occurring across the codons for amino
acids 51 and 52 of neo. This linker encoded the
necessary DNA elements to create an artificial splice donor
site, intervening intron and splice acceptor site within
the neo gene, thus creating two separate exons,
presently referred to as neo exon 1 and 2. Neo exon 1 encodes
the first 51 amino acids of neo, while exon 2 encodes
the remaining 203 amino acids plus the stop codon of the
protein. A NotI cloning site was also created within the
intron.

Neo exon 2 was further subdivided to produce neo
exons 2 and 3. This was achieved as follows: A set of
PCR primers was designed to amplify a region of DNA
encoding neo exon 1, intron and the first 111 2/3 amino
acids of exon 2. The 3' PCR primer resulted in the
introduction of a new 5' splice site immediately after
the second nucleotide of the codon for amino acid 111 in
exon 2, therefore generating a new smaller exon 2. The
DNA fragment now encoding the original exon 1, intron
and new exon 2 was then subcloned and propagated in a
pBR based vector. The remainder of the original exon 2
was used as a template for another round of PCR
amplification, which generated "exon 3". The 5' primer
for this round of amplification introduced a new splice
acceptor site at the 5' side of the newly created exon
3, i.e. before the final nucleotide of the codon for
amino acid 111. The resultant 3 exons of neo encode the
following information: exon 1 - the first 51 amino acids
of neo; exon 2 - the next 111 2/3 amino acids, and exon
3 - the final 91 1/3 amino acids plus the translational
stop codon of the neo gene.
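
The exon bookkeeping above can be checked with a short calculation. This is a sketch for verification only; the variable names are hypothetical, and the stop codon (which the text places in exon 3) is deliberately excluded from the nucleotide counts.

```python
from fractions import Fraction

# Exon sizes in amino acids as stated in the text; the fractional parts
# reflect a codon split 2 nt / 1 nt across the exon 2 / exon 3 boundary.
exon_lengths_aa = {
    "exon 1": Fraction(51),
    "exon 2": Fraction(111) + Fraction(2, 3),
    "exon 3": Fraction(91) + Fraction(1, 3),
}

# Total coding length, which should equal 51 + 203 amino acids.
total_aa = sum(exon_lengths_aa.values())

# Coding nucleotides per exon (3 nt per amino acid, stop codon excluded).
coding_nt = {name: int(aa * 3) for name, aa in exon_lengths_aa.items()}

print(total_aa)   # total amino acids across the three exons
print(coding_nt)  # whole-number nucleotide count for each exon
```

Each exon corresponds to a whole number of nucleotides, and exons 2 and 3 together account for the 203 amino acids originally assigned to exon 2.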

Neo exon 3 was incorporated along with the above
mentioned DNA elements into the marking plasmid
"Desmond". Neo exons 1 and 2 were incorporated into the
targeting plasmid "Molly". The NotI cloning site
created within the intron between exons 1 and 2 was used in
subsequent cloning steps to insert genes of interest
into the targeting plasmid.

A second targeting plasmid "Mandy" was also
generated. This plasmid is almost identical to "Molly"
(some restriction sites on the vector have been changed)
except that the original HisD and DHFR genes contained
in "Molly" were inactivated. These changes were
incorporated because the Desmond cell line was no longer
being cultured in the presence of histidinol, therefore
it seemed unnecessary to include a second copy of the
HisD gene. Additionally, the DHFR gene was inactivated
to ensure that only a single DHFR gene, namely the one
present in the Desmond marked site, would be amplifiable
in any resulting cell lines. "Mandy" was derived from
"Molly" by the following modifications:

(i) A synthetic linker was inserted in the middle
of the DHFR coding region. This linker created a stop
codon and shifted the remainder of the DHFR coding
region out of frame, therefore rendering the gene
nonfunctional.

(ii) A portion of the HisD gene was deleted and
replaced with a PCR generated HisD fragment lacking the
promoter and start codon of the gene.

Figure 1 depicts the arrangement of these DNA
elements in the marker plasmid "Desmond". Figure 2 depicts
the arrangement of these elements in the first targeting
plasmid, "Molly". Figure 3 illustrates the possible
arrangement in the CHO genome of the various DNA
elements after targeting and integration of Molly DNA
into Desmond marked CHO cells. Figure 9 depicts the
targeting plasmid "Mandy."

Assembly of the marking and targeting plasmids
from the above listed DNA elements was carried out
following conventional cloning techniques (see, e.g.,
Molecular Cloning, A Laboratory Manual, J. Sambrook et
al, 1987, Cold Spring Harbor Laboratory Press, and
Current Protocols in Molecular Biology, F. M. Ausubel et
al, eds., 1987, John Wiley and Sons). All plasmids were
propagated and maintained in E. coli XL1 Blue
(Stratagene, cat. # 200236). Large scale plasmid
preparations were prepared using Promega Wizard Maxiprep
DNA Purification System®, according to the
manufacturer's directions.

EXAMPLE 2

Preparation of a Marked CHO Cell Line

1. Cell Culture and Transfection Procedures to
Produce Marked CHO Cell Line

Marker plasmid DNA was linearized by digestion
overnight at 37°C with Bst1107I. Linearized vector was
ethanol precipitated and resuspended in sterile TE to a
concentration of 1 mg/ml. Linearized vector was
introduced into DHFR- Chinese hamster ovary (CHO)
DG44 cells (Urlaub et al, Som. Cell and Mol. Gen.,
12:555-566 (1986)) by electroporation as follows.

Exponentially growing cells were harvested by
centrifugation, washed once in ice cold SBS (sucrose
buffered solution, 272 mM sucrose, 7 mM sodium phosphate,
pH 7.4, 1 mM magnesium chloride) then resuspended in SBS
to a concentration of 10^7 cells/ml. After a 15 minute
incubation on ice, 0.4 ml of the cell suspension was
mixed with 40 μg linearized DNA in a disposable
electroporation cuvette. Cells were shocked using a BTX
electrocell manipulator (San Diego, CA) set at 230
volts, 400 microfaraday capacitance, 13 ohm resistance.
Shocked cells were then mixed with 20 ml of prewarmed
CHO growth media (CHO-S-SFMII, Gibco/BRL, catalog #
31033-012) and plated in 96 well tissue culture plates.
Forty eight hours after electroporation, plates were fed
with selection media (in the case of transfection with
Desmond, selection media is CHO-S-SFMII without
hypoxanthine or thymidine, supplemented with 2 mM
histidinol (Sigma catalog # H6647)). Plates were
maintained in selection media for up to 30 days, or until
some of the wells exhibited cell growth. These cells
were then removed from the 96 well plates and expanded
ultimately to 120 ml spinner flasks where they were
maintained in selection media at all times.
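
The per-cuvette quantities implied by the electroporation procedure above can be worked out as follows. This is a sketch under stated assumptions: the cell density is read as 10^7 cells/ml and the DNA amount as 40 micrograms (both values are partly illegible in the source text), and the constant names are hypothetical.

```python
# Assumed readings of the procedure above (see lead-in).
CELL_DENSITY_PER_ML = 1e7    # cells/ml in SBS (assumed reading)
CELL_VOLUME_ML = 0.4         # ml of cell suspension per cuvette
DNA_MASS_UG = 40.0           # micrograms of linearized DNA (assumed reading)
DNA_STOCK_MG_PER_ML = 1.0    # stock concentration; 1 mg/ml equals 1 ug/ul

# Number of cells shocked in each cuvette.
cells_per_cuvette = CELL_DENSITY_PER_ML * CELL_VOLUME_ML

# Volume of DNA stock needed: ug divided by ug/ul gives microlitres.
dna_volume_ul = DNA_MASS_UG / DNA_STOCK_MG_PER_ML

# DNA dose normalized per million cells.
dna_ug_per_million_cells = DNA_MASS_UG / (cells_per_cuvette / 1e6)

print(f"{cells_per_cuvette:.1e} cells per cuvette")
print(f"{dna_volume_ul:.0f} ul of DNA stock per cuvette")
print(f"{dna_ug_per_million_cells:.0f} ug DNA per 10^6 cells")
```

If either assumed reading is wrong, only the two constants at the top need to be changed.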

EXAMPLE 3

Characterization of Marked CHO Cell Lines

(a) Southern Analysis

Genomic DNA was isolated from all stably growing
Desmond marked CHO cells. DNA was isolated using the
Invitrogen Easy® DNA kit, according to the
manufacturer's directions. Genomic DNA was then digested with
HindIII overnight at 37°C, and subjected to Southern
analysis using a PCR generated digoxigenin labelled
probe specific to the DHFR gene. Hybridizations and
washes were carried out using Boehringer Mannheim's DIG
Easy Hyb (catalog # 1603 558) and DIG Wash and Block
Buffer Set (catalog # 1585 762) according to the
manufacturer's directions. DNA samples containing a single
band hybridizing to the DHFR probe were assumed to be
Desmond clones arising from a single cell which had
integrated a single copy of the plasmid. These clones
were retained for further analysis. Out of a total of
45 HisD resistant cell lines isolated, only 5 were
single copy integrants. Figure 4 shows a Southern blot
containing all 5 of these single copy Desmond clones.
Clone names are provided in the figure legend.

(b) Northern Analysis
Total RNA was isolated from all single copy Desmond
clones using TRIzol reagent (Gibco/BRL cat # 15596-026)
according to the manufacturer's directions. 10-20 μg RNA
from each clone was analyzed on duplicate formaldehyde
gels. The resulting blots were probed with PCR
generated digoxigenin labelled DNA probes to (i) DHFR
message, (ii) HisD message and (iii) CAD message. CAD
is a trifunctional protein involved in uridine
biosynthesis (Wahl et al, J. Biol. Chem., 254, 17:8679-
8689 (1979)), and is expressed equally in all cell
types. It is used here as an internal control to help
quantitate RNA loading. Hybridizations and washes were
carried out using the above mentioned Boehringer
Mannheim reagents. The results of the Northern analysis
are shown in Figure 5. The single copy Desmond clone
exhibiting the highest levels of both the HisD and DHFR
message is clone 15C9, shown in lane 4 in both panels of
the figure. This clone was designated as the "marked
cell line" and used in future targeting experiments in
CHO, examples of which are presented in the following
sections.

EXAMPLE 4

Expression of Anti-CD20 Antibody
in Desmond Marked CHO Cells

C2B8, a chimeric antibody which recognizes B-cell
surface antigen CD20, has been cloned and expressed
previously in our laboratory (Reff et al, Blood,
83:435-445 (1994)). A 4.1 kb DNA fragment comprising the
C2B8 light and heavy chain genes, along with the
necessary regulatory elements (eukaryotic promoter and
polyadenylation signals) was inserted into the artificial
intron created between exons 1 and 2 of the neo gene
contained in a pBR derived cloning vector. This newly
generated 5 kb DNA fragment (comprising neo exon 1, C2B8
and neo exon 2) was excised and used to assemble the
targeting plasmid Molly. The other DNA elements used in
the construction of Molly are identical to those used to
construct the marking plasmid Desmond, identified
previously. A complete map of Molly is shown in Fig. 2.

The targeting vector Molly was linearized prior to
transfection by digestion with KpnI and PacI, ethanol
precipitated and resuspended in sterile TE to a
concentration of 1.5 mg/mL. Linearized plasmid was introduced
into exponentially growing Desmond marked cells
essentially as described, except that 80 μg DNA was used in
each electroporation. Forty eight hours
postelectroporation, 96 well plates were supplemented with selection
medium: CHO-S-SFMII supplemented with 400 μg/mL
Geneticin (G418, Gibco/BRL catalog # 10131-019). Plates were
maintained in selection medium for up to 30 days, or
until cell growth occurred in some of the wells. Such
growth was assumed to be the result of clonal expansion
of a single G418 resistant cell. The supernatants from
all G418 resistant wells were assayed for C2B8
production by standard ELISA techniques, and all productive
clones were eventually expanded to 120 mL spinner flasks
and further analyzed.

Characterization of Antibody Secreting Targeted Cells

A total of 50 electroporations with Molly targeting
plasmid were carried out in this experiment, each of
which was plated into separate 96 well plates. A total
of 10 viable, anti-CD20 antibody secreting clones were
obtained and expanded to 120 ml spinner flasks. Genomic
DNA was isolated from all clones, and Southern analyses
were subsequently performed to determine whether the
clones represented single homologous recombination
events or whether additional random integrations had
occurred in the same cells. The methods for DNA
isolation and Southern hybridization were as described in the
previous section. Genomic DNA was digested with EcoRI
and probed with a PCR generated digoxigenin labelled
probe to a segment of the CD20 heavy chain constant
region. The results of this Southern analysis are
presented in Figure 6. As can be seen in the figure, 8 of
the 10 clones show a single band hybridizing to the CD20
probe, indicating a single homologous recombination
event has occurred in these cells. Two of the ten,
clones 24G2 and 28C9, show the presence of additional
band(s), indicative of an additional random integration
elsewhere in the genome.

We examined the expression levels of anti-CD20
antibody in all ten of these clones, the data for which
are shown in Table 1, below.

Table 1:

Expression Level of Anti-CD20
Secreting Homologous Integrants

Clone      Anti-CD20, pg/c/d

20F4       3.5
25E1       2.4
42F9       1.8
39G11      1.5
21C7       1.3
50G10      0.9
29F9       0.8
5F9        0.3
28C9*      4.5
24G2*      2.1

* These clones contained additional randomly
integrated copies of anti-CD20. Expression
levels of these clones therefore reflect a
contribution from both the homologous and
random sites.

Expression levels are reported as picogram per cell per
day (pg/c/d) secreted by the individual clones, and
represent the mean levels obtained from three separate
ELISAs on samples taken from 120 mL spinner flasks.

As can be seen from the data, there is a variation
in antibody secretion of approximately ten fold between
the highest and lowest clones. This was somewhat
unexpected, as we anticipated similar expression levels from
all clones because the anti-CD20 genes are all
integrated into the same Desmond marked site.
Nevertheless, this observed range in expression is extremely small
in comparison to that seen using any traditional random
integration method or with our translationally impaired
vector system.
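
The spread in expression among the single copy integrants can be computed directly from Table 1. This is a small sketch using only the tabulated values; the two clones marked with an asterisk are omitted because they carry additional random integrations, and the variable names are hypothetical.

```python
# pg/cell/day for the eight single-site homologous integrants in Table 1.
single_copy_pg_c_d = {
    "20F4": 3.5, "25E1": 2.4, "42F9": 1.8, "39G11": 1.5,
    "21C7": 1.3, "50G10": 0.9, "29F9": 0.8, "5F9": 0.3,
}

# Identify the extreme clones and the fold difference between them.
highest = max(single_copy_pg_c_d, key=single_copy_pg_c_d.get)
lowest = min(single_copy_pg_c_d, key=single_copy_pg_c_d.get)
fold_range = single_copy_pg_c_d[highest] / single_copy_pg_c_d[lowest]

print(f"highest: {highest}, lowest: {lowest}, fold range: {fold_range:.1f}")
```

The ratio comes out between eleven and twelve, consistent with the "approximately ten fold" variation described in the text.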

Clone 20F4, the highest producing single copy
integrant, was selected for further study. Table 2 (below)
presents ELISA and cell culture data from seven day
production runs of this clone.

Table 2:

7 Day Production Run Data for 20F4

Day   % Viable   Viable/ml   Tx2 (hr)   mg/L   pg/c/d
                 (x 10^5)

1     96         3.4         31         1.3    4.9
2     94         6           29         2.5    3.4
3     94         9.9         33         4.7    3.2
4     90         17.4        30         6.8    3
5     73         14                     8.3
6     17         3.5                    9.5

Clone 20F4 was seeded at 2 x 10^5 cells/ml in a 120 ml
spinner flask on day 0. On the following six days, cell
counts were taken, doubling times calculated and 1 ml
samples of supernatant removed from the flask and
analyzed for secreted anti-CD20 by ELISA.
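
The doubling times (Tx2) in Table 2 are consistent with the standard exponential growth formula, Td = t x ln 2 / ln(N1/N0), applied to consecutive daily counts. The sketch below is an assumption about how the column was derived (the text does not state the formula), with a 24 hour interval between counts also assumed; the function name is hypothetical.

```python
import math


def doubling_time_hr(n_start, n_end, interval_hr=24.0):
    """Doubling time assuming exponential growth between two cell counts."""
    return interval_hr * math.log(2) / math.log(n_end / n_start)


# Viable cells/ml: seeding density on day 0, then Table 2 values for days 1-4.
viable_per_ml = [2.0e5, 3.4e5, 6.0e5, 9.9e5, 17.4e5]

# One doubling time per consecutive pair of daily counts.
doubling_times = [
    doubling_time_hr(a, b) for a, b in zip(viable_per_ml, viable_per_ml[1:])
]

for day, td in enumerate(doubling_times, start=1):
    print(f"day {day}: {td:.1f} hr")
```

Under these assumptions the calculated values agree with the tabulated Tx2 entries for days 1 to 4 to within about one hour. No doubling time is meaningful for days 5 and 6, when the viable count falls.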






This clone is secreting, on average, 3-5 pg
antibody/cell/day, based on this ELISA data. This is the same
level as obtained from other high expressing single copy
clones obtained previously in our laboratory using the
previously developed translationally impaired random
integration vectors. This result indicates the
following:

(1) that the site in the CHO genome marked by the
Desmond marking vector is highly transcriptionally
active, and therefore represents an excellent site from
which to express recombinant proteins, and

(2) that targeting by means of homologous
recombination can be accomplished using the subject vectors and
occurs at a frequency high enough to make this system a
viable and desirable alternative to random integration
methods.

To further demonstrate the efficacy of this system,
we have also demonstrated that this site is amplifiable,
resulting in even higher levels of gene expression and
protein secretion. Amplification was achieved by
plating serial dilutions of 20F4 cells, starting at a
density of 2.5 x 10^4 cells/ml, in 96 well tissue culture
dishes, and culturing these cells in media (CHO-S-SFMII)
supplemented with 5, 10, 15 or 20 nM methotrexate.
Antibody secreting clones were screened using standard ELISA
techniques, and the highest producing clones were
expanded and further analyzed. A summary of this
amplification experiment is presented in Table 3 below.



WO 98/41645 



PCT/US98/03935 






Table 3:

Summary of 20F4 Amplification

nM MTX   # Wells    Expression Level   # Wells    Expression Level
         Assayed    mg/l, 96 well      Expanded   pg/c/d from spinner

10       56         3-13               4          10-15
15       27         2-14               3          15-18
20       17         4-11               1          ND



Methotrexate amplification of 20F4 was set up as
described in the text, using the concentrations of
methotrexate indicated in the above table. Supernatants
from all surviving 96 well colonies were assayed by
ELISA, and the range of anti-CD20 expressed by these
clones is indicated in column 3. Based on these
results, the highest producing clones were expanded to
120 ml spinners and several ELISAs conducted on the
spinner supernatants to determine the pg/cell/day
expression levels, reported in column 5.



The data here clearly demonstrate that this site can be
amplified in the presence of methotrexate. Clones from
the 10 and 15 nM amplifications were found to produce on
the order of 15-20 pg/cell/day.

A 15nM clone, designated 20F4-15A5, was selected as 
the highest expressing cell line. This clone originated 
from a 96 well plate in which only 22 wells grew, and 
was therefore assumed to have arisen from a single cell. 

The clone was then subjected to a further round of
methotrexate amplification. As described above, serial
dilutions of the culture were plated into 96 well dishes
and cultured in CHO-S-SFMII medium supplemented with
200, 300 or 400 nM methotrexate. Surviving clones were
screened by ELISA, and several high producing clones
were expanded to spinner cultures and further analyzed.
A summary of this second amplification experiment is
presented in Table 4.



Table 4:

Summary of 20F4-15A5 Amplification

nM MTX   # Wells    Expression Level   # Wells    Expression Level
         Assayed    mg/l, 96 well      Expanded   pg/c/d, spinner

200      67         23-70              1          50-60
250      86         21-70              4          55-60
300      81         15-75              3          40-50

Methotrexate amplifications of 20F4-15A5 were set up
and assayed as described in the text. The highest
producing wells, the numbers of which are indicated in
column 4, were expanded to 120 ml spinner flasks. The
expression levels of the cell lines derived from these
wells are recorded as pg/c/d in column 5.



The highest producing clone came from the 250 nM
methotrexate amplification. The 250 nM clone, 20F4-15A5-250A6,
originated from a 96 well plate in which only wells
grew, and therefore is assumed to have arisen from a
single cell. Taken together, the data in Tables 3 and 4
strongly indicate that two rounds of methotrexate
amplification are sufficient to reach expression levels of
60 pg/cell/day, which is approaching the maximum
secretion capacity of immunoglobulin in mammalian cells
(Reff, M.E., Curr. Opin. Biotech., 4:573-576 (1993)).
The ability to reach this secretion capacity with just
two amplification steps further enhances the utility of
this homologous recombination system. Typically, random
integration methods require more than two amplification
steps to reach this expression level and are generally
less reliable in terms of the ease of amplification.
Thus, the homologous system offers a more efficient and
time saving method of achieving high level gene
expression in mammalian cells.
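
The gain from each amplification round can be estimated from the reported ranges. The sketch below compares range midpoints, which is a simplifying assumption (the text reports only ranges): 3-5 pg/c/d for unamplified 20F4, 15-18 pg/c/d for the 15 nM clones of Table 3, and 55-60 pg/c/d for the 250 nM clones of Table 4. The names are hypothetical.

```python
# Reported expression ranges in pg/cell/day at each stage.
stages = {
    "unamplified 20F4": (3.0, 5.0),
    "first round (15 nM MTX)": (15.0, 18.0),
    "second round (250 nM MTX)": (55.0, 60.0),
}

# Midpoint of each range, used as a rough single-value summary.
midpoints = {name: (lo + hi) / 2 for name, (lo, hi) in stages.items()}
values = list(midpoints.values())

first_round_fold = values[1] / values[0]
second_round_fold = values[2] / values[1]
overall_fold = values[2] / values[0]

print(f"first round:  {first_round_fold:.1f}-fold")
print(f"second round: {second_round_fold:.1f}-fold")
print(f"overall:      {overall_fold:.1f}-fold")
```

On this rough estimate each round contributes a three to four fold increase, for an overall increase of roughly fourteen fold over the unamplified clone.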

EXAMPLE 5

Expression of Anti-Human CD23 Antibody
in Desmond Marked CHO Cells

CD23 is a low affinity IgE receptor which mediates
binding of IgE to B and T lymphocytes (Sutton, B.J., and
Gould, H.J., Nature, 366:421-428 (1993)). Anti-human
CD23 monoclonal antibody 5E8 is a human gamma-1
monoclonal antibody recently cloned and expressed in our
laboratory. This antibody is disclosed in commonly
assigned Serial No. 08/803,085, filed on February 20,
1997.

The heavy and light chain genes of 5E8 were cloned
into the mammalian expression vector N5KG1, a derivative
of the vector NEOSPLA (Barnett et al, in Antibody
Expression and Engineering, H.Y. Yang and T. Imanaka, eds.,
pp 27-40 (1995)) and two modifications were then made to
the genes. We have recently observed somewhat higher
secretion of immunoglobulin light chains compared to
heavy chains in other expression constructs in the
laboratory (Reff et al, 1997, unpublished observations). In
an attempt to compensate for this deficit, we altered
the 5E8 heavy chain gene by the addition of a stronger
promoter/enhancer element immediately upstream of the
start site. In subsequent steps, a 2.9 kb DNA fragment
comprising the 5E8 modified light and heavy chain genes
was isolated from the N5KG1 vector and inserted into the
targeting vector Mandy. Preparation of 5E8-containing
Molly and electroporation into Desmond 15C9 CHO cells
was essentially as described in the preceding section.

One modification to the previously described
protocol was in the type of culture medium used. Desmond
marked CHO cells were cultured in protein-free CD-CHO
medium (Gibco-BRL, catalog # AS21206) supplemented with
3 mg/L recombinant insulin (3 mg/mL stock, Gibco-BRL,
catalog # AS22057) and 8 mM L-glutamine (200 mM stock,
Gibco-BRL, catalog # 25030-081). Subsequently,
transfected cells were selected in the above medium
supplemented with 400 μg/mL geneticin. In this experiment, 20
electroporations were performed and plated into 96 well
tissue culture dishes. Cells grew and secreted
anti-CD23 in a total of 68 wells, all of which were assumed
to be clones originating from a single G418 resistant cell.
Twelve of these wells were expanded to 120 ml spinner
flasks for further analysis. We believe the increased
number of clones isolated in this experiment (68
compared with 10 for anti-CD20 as described in Example 4)
is due to a higher cloning efficiency and survival rate
of cells grown in CD-CHO medium compared with
CHO-S-SFMII medium. Expression levels for those clones
analyzed in spinner culture ranged from 0.5-3 pg/c/d, in
close agreement with the levels seen for the anti-CD20
clones. The highest producing anti-CD23 clone,
designated 4H12, was subjected to methotrexate amplification
in order to increase its expression levels. This
amplification was set up in a manner similar to that
described for the anti-CD20 clone in Example 4. Serial
dilutions of exponentially growing 4H12 cells were plated
into 96 well tissue culture dishes and grown in CD-CHO
medium supplemented with 3 mg/L insulin, 8 mM glutamine
and 30, 35 or 40 nM methotrexate. A summary of this
amplification experiment is presented in Table 5.



Table 5:

Summary of 4H12 Amplification

nM MTX   # Wells    Expression Level   # Wells    Expression Level
         Assayed    mg/l, 96 well      Expanded   pg/c/d from spinner

30       100        6-24               8          10-25
35       64         4-27               2          10-15
40       96         4-20               1          ND



The highest expressing clone obtained was a 30nM clone,
isolated from a plate on which 22 wells had grown.
This clone, designated 4H12-30G5, reproducibly
secreted 18-22pg antibody per cell per day. This is
a range similar to that seen for the first amplification
of the anti-CD20 clone 20F4 (clone 20F4-15A5,
which produced 15-18pg/c/d, as described in Example 4).
These data further support the observation
that amplification at this marked site in CHO is
reproducible and efficient. A second amplification of this
30nM cell line is currently underway. It is anticipated
that saturation levels of expression will be
achievable for the anti-CD23 antibody in just two
amplification steps, as was the case for anti-CD20.
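The comparison drawn from Table 5 can be sketched in a few lines of Python. This is only an illustrative tabulation of the figures quoted above. The field and function names are hypothetical, "ND" is represented as None, and the fold increase assumes the unamplified clones are represented by the reported 0.5-3pg/c/d range.

```python
# Table 5 values as given in the text; field names are illustrative only.
# Ranges are (low, high) tuples; None stands for "ND" (not determined).
TABLE_5 = [
    {"mtx_nM": 30, "wells_assayed": 100, "mg_per_l_96well": (6, 24),
     "wells_expanded": 8, "pg_c_d_spinner": (10, 25)},
    {"mtx_nM": 35, "wells_assayed": 64, "mg_per_l_96well": (4, 27),
     "wells_expanded": 2, "pg_c_d_spinner": (10, 15)},
    {"mtx_nM": 40, "wells_assayed": 96, "mg_per_l_96well": (4, 20),
     "wells_expanded": 1, "pg_c_d_spinner": None},
]


def best_spinner_row(rows):
    """Return the row whose spinner expression range has the highest upper bound."""
    measured = [r for r in rows if r["pg_c_d_spinner"] is not None]
    return max(measured, key=lambda r: r["pg_c_d_spinner"][1])


def min_fold_increase(before_range, after_range):
    """Most conservative fold increase: lowest 'after' over highest 'before'."""
    return after_range[0] / before_range[1]


best = best_spinner_row(TABLE_5)
# Unamplified clones: 0.5-3 pg/c/d; clone 4H12-30G5: 18-22 pg/c/d.
fold = min_fold_increase((0.5, 3.0), (18.0, 22.0))
print(best["mtx_nM"], fold)
```

Under these assumptions the 30nM condition gives the highest spinner expression, and the amplified clone shows at least a six-fold increase over the unamplified range.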



EXAMPLE 6

Expression of Immunoadhesin in Desmond Marked CHO Cells 

CTLA-4, a member of the Ig superfamily, is found on
the surface of T lymphocytes and is thought to play a
role in antigen-specific T-cell activation (Dariavach et
al, Eur. J. Immunol., 18:1901-1905 (1988); and Linsley
et al, J. Exp. Med., 174:561-569 (1991)). In order to
further study the precise role of the CTLA-4 molecule in
the activation pathway, a soluble fusion protein comprising
the extracellular domain of CTLA-4 linked to a
truncated form of the human IgG1 constant region was
created (Linsley et al, Id.). We have recently
expressed this CTLA-4 Ig fusion protein in the mammalian
expression vector BLECH1, a derivative of the plasmid
NEOSPLA (Barnett et al, in Antibody Expression and
Engineering, H.Y Yang and T. Imanaka, eds., pp27-40 (1995)).
An 800bp fragment encoding the CTLA-4 Ig was isolated
from this vector and inserted between the SacII and
BglII sites in Molly.

Preparation of CTLA-4Ig-Molly and electroporation
into Desmond clone 15C9 CHO cells was performed as
described in the previous example relating to anti-CD20.
Twenty electroporations were carried out, and plated
into 96 well culture dishes as described previously.

[illegible]
96 well plates and carried forward to the 120ml spinner
stage. Southern analyses on genomic DNA isolated from
each of these clones were then carried out to determine
how many of the homologous clones contained additional
random integrants. Genomic DNA was digested with BglII
and probed with a PCR generated digoxigenin labelled
probe to the human IgG1 constant region. The results of
this analysis indicated that 85% of the CTLA-4 clones
are homologous integrants only; the remaining 15%
contained one additional random integrant. This result
corroborates the findings from the expression of anti-CD20
discussed above, where 80% of the clones were single
homologous integrants. Therefore, we can conclude
that this expression system reproducibly yields single
targeted homologous integrants in at least 80% of all
clones produced.
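The "at least 80%" figure is simply the lower of the two reported fractions. A minimal sketch of that tally follows. The function name is hypothetical, and the per-clone list is invented purely to illustrate the calculation, since the text reports percentages rather than clone counts.

```python
def fraction_single_integrants(extra_copies):
    """Fraction of clones with no random integrants besides the targeted copy."""
    if not extra_copies:
        raise ValueError("no clones supplied")
    return sum(1 for n in extra_copies if n == 0) / len(extra_copies)


# Fractions of homologous-only clones reported in the text.
reported = {"CTLA-4 Ig": 0.85, "anti-CD20": 0.80}
lower_bound = min(reported.values())

# Hypothetical tally (not from the source): 20 clones, 3 of which
# carry one additional random integrant.
example = [0] * 17 + [1] * 3
example_fraction = fraction_single_integrants(example)
```

Here `extra_copies` holds, for each clone, the number of additional random integrants inferred from the Southern blot.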

Expression levels for the homologous CTLA-4 Ig
clones ranged from 8-12pg/cell/day. This is somewhat
higher than the range reported for anti-CD20 antibody
and anti-CD23 antibody clones discussed above. However,
we have previously observed that expression of this
molecule using the intronic insertion vector system also
resulted in significantly higher expression levels than
are obtained for immunoglobulins. We are currently
unable to provide an explanation for this observation.

EXAMPLE 7 

Targeting Anti-CD20 to an Alternate Desmond Marked CHO
Cell Line

As we described in a preceding section, we obtained
5 single copy Desmond marked CHO cell lines (see Figures
4 and 5). In order to demonstrate that the success of
our targeting strategy is not due to some unique property
of Desmond clone 15C9 and limited only to this clone,
we introduced anti-CD20 Molly into Desmond clone 9B2
(lane 6 in Figure 4, lane 1 in Figure 5). Preparation
of Molly DNA and electroporation into Desmond 9B2 was
exactly as described in the previous example pertaining
to anti-CD20. We obtained one homologous integrant from
this experiment. This clone was expanded to a 120ml
spinner flask, where it produced on average 1.2pg
anti-CD20/cell/day. This is considerably lower expression
than we observed with Molly targeted into Desmond 15C9.
However, this was the anticipated result, based on our
northern analysis of the Desmond clones. As can be seen
in Figure 5, mRNA levels from clone 9B2 are considerably
lower than those from 15C9, indicating the site in this
clone is not as transcriptionally active as that in
15C9. Therefore, this experiment not only demonstrates
the reproducibility of the system - presumably any
marked Desmond site can be targeted with Molly - it also
confirms the northern data that the site in Desmond 15C9
is the most transcriptionally active.



From the foregoing, it will be appreciated that,
although specific embodiments of the invention have been
described herein for purposes of illustration, various
modifications may be made without deviating from the
scope of the invention. Accordingly, the invention is
not limited except as by the appended claims.

WHAT IS CLAIMED IS:



1. A method for inserting a desired DNA at a
target site in the genome of a mammalian cell which
comprises the following steps:

(i) transfecting or transforming a mammalian cell
with a first plasmid ("marker plasmid") containing the
following sequences:

(a) a region of DNA that is heterologous to
the mammalian cell genome which when integrated in the
mammalian cell genome provides a unique site for
homologous recombination;

(b) a DNA fragment encoding a portion of a
first selectable marker protein; and

(c) at least one other selectable marker DNA
that provides for selection of mammalian cells which
have been successfully integrated with the marker
plasmid;

(ii) selecting a cell which contains the marker
plasmid integrated in its genome;

(iii) transfecting or transforming said selected
cell with a second plasmid ("target plasmid") which
contains the following sequences:

(a) a region of DNA that is identical or is
sufficiently homologous to the unique region in the
marker plasmid such that this region of DNA can
recombine with said DNA via homologous recombination;

(b) a DNA fragment encoding a portion of the
same selectable marker contained in the marker plasmid,
wherein the active selectable marker protein encoded by
said DNA is only produced if said fragment is expressed
in association with the fragment of said selectable
marker DNA contained in the marker plasmid; and

(iv) selecting cells which contain the target
plasmid integrated at the target site by screening for the
expression of the first selectable marker protein.

2. The method of Claim 1, wherein the DNA fragment
encoding a fragment of a first selectable marker is
an exon of a dominant selectable marker.

3. The method of Claim 2, wherein the second
plasmid contains the remaining exons of said first
selectable marker.

4. The method of Claim 3, wherein at least one
DNA encoding a desired protein is inserted between said
exons of said first selectable marker contained in the
target plasmid.

5. The method of Claim 4, wherein a DNA encoding a
dominant selectable marker is further inserted between
the exons of said first selectable marker contained in
the target plasmid to provide for co-amplification of
the DNA encoding the desired protein.

6. The method of Claim 3, wherein the first dominant
selectable marker is selected from the group consisting
of neomycin phosphotransferase, histidinol dehydrogenase,
dihydrofolate reductase, hygromycin phosphotransferase,
herpes simplex virus thymidine kinase,
adenosine deaminase, glutamine synthetase, and
hypoxanthine-guanine phosphoribosyl transferase.

7. The method of Claim 4, wherein the desired 
protein is a mammalian protein. 

8. The method of Claim 7, wherein the protein is 
an immunoglobulin. 

9. The method of Claim 1, which further comprises
determining the RNA levels of the selectable marker (c)
contained in the marker plasmid prior to integration of
the target vector.

10. The method of Claim 9, wherein the other
selectable marker contained in the marker plasmid is a
dominant selectable marker selected from the group
consisting of histidinol dehydrogenase, herpes simplex
thymidine kinase, hygromycin phosphotransferase,
adenosine deaminase and glutamine synthetase.

11. The method of Claim 1, wherein the mammalian 
cell is selected from the group consisting of Chinese 
hamster ovary (CHO) cells, myeloma cells, baby hamster 
kidney cells, COS cells, NSO cells, HeLa cells and NIH 
3T3 cells. 

12. The method of Claim 11, wherein the cell is a 
CHO cell. 

13. The method of Claim 1, wherein the marker
plasmid contains the third exon of the neomycin
phosphotransferase gene and the target plasmid contains the
first two exons of the neomycin phosphotransferase gene.

14. The method of Claim 1, wherein the marker
plasmid further contains a rare restriction endonuclease
sequence which is inserted within the region of
homology.

15. The method of Claim 1, wherein the unique
region of DNA that provides for homologous recombination
is a bacterial DNA, a viral DNA or a synthetic DNA.

16. The method of Claim 1, wherein the unique
region of DNA that provides for homologous recombination
is at least 300 nucleotides.

17. The method of Claim 16, wherein the unique
region of DNA ranges in size from about 300 nucleotides
to 20 kilobases.

18. The method of Claim 17, wherein the unique
region of DNA preferably ranges in size from 2 to 10
kilobases.

19. The method of Claim 1, wherein the first
selectable marker DNA is split into at least three
exons.

20. The method of Claim 1, wherein the unique
region of DNA that provides for homologous recombination
is a bacterial DNA, an insect DNA, a viral DNA or a
synthetic DNA.

21. The method of Claim 20, wherein the unique 
region of DNA does not contain any functional genes. 

22. A vector system for inserting a desired DNA at
a target site in the genome of a mammalian cell which
comprises at least the following:

(i) a first plasmid ("marker plasmid") containing
at least the following sequences:

(a) a region of DNA that is heterologous to
the mammalian cell genome which when integrated in the
mammalian cell genome provides a unique site for
homologous recombination;

(b) a DNA fragment encoding a portion of a
first selectable marker protein; and

(c) at least one other selectable marker DNA
that provides for selection of mammalian cells which
have been successfully integrated with the marker
plasmid; and

(ii) a second plasmid ("target plasmid") which
contains at least the following sequences:

(a) a region of DNA that is identical or is
sufficiently homologous to the unique region in the
marker plasmid such that this region of DNA can
recombine with said DNA via homologous recombination;

(b) a DNA fragment encoding a portion of the
same selectable marker contained in the marker plasmid,
wherein the active selectable marker protein encoded by
said DNA is only produced if said fragment is expressed
in association with the fragment of said selectable
marker DNA contained in the marker plasmid.

23. The vector system of Claim 22, wherein the DNA
fragment encoding a fragment of a first selectable
marker is an exon of a dominant selectable marker.

24. The vector system of Claim 23, wherein the
second plasmid contains the remaining exons of said
first selectable marker.



25. The vector system of Claim 24, wherein at
least one DNA encoding a desired protein is inserted
between said exons of said first selectable marker
contained in the target plasmid.

26. The vector system of Claim 24, wherein a DNA
encoding a dominant selectable marker is further
inserted between the exons of said first selectable marker
contained in the target plasmid to provide for
co-amplification of the DNA encoding the desired protein.

27. The vector system of Claim 24, wherein the
first dominant selectable marker is selected from the
group consisting of neomycin phosphotransferase,
histidinol dehydrogenase, dihydrofolate reductase,
hygromycin phosphotransferase, herpes simplex virus
thymidine kinase, adenosine deaminase, glutamine
synthetase, and hypoxanthine-guanine phosphoribosyl
transferase.

28. The vector system of Claim 25, wherein the
desired protein is a mammalian protein.

29. The vector system of Claim 28, wherein the
protein is an immunoglobulin.

30. The vector system of Claim 22, wherein the
other selectable marker contained in the marker plasmid
is a dominant selectable marker selected from the group
consisting of histidinol dehydrogenase, herpes simplex
thymidine kinase, hygromycin phosphotransferase,
adenosine deaminase and glutamine synthetase.

31. The vector system of Claim 22, which provides
for insertion of a desired DNA at a targeted site in the
genome of a mammalian cell selected from the group
consisting of Chinese hamster ovary (CHO) cells, myeloma
cells, baby hamster kidney cells, COS cells, NSO cells,
HeLa cells and NIH 3T3 cells.

32. The vector system of Claim 31, wherein the 
mammalian cell is a CHO cell. 

33. The vector system of Claim 22, wherein the
marker plasmid contains the third exon of the neomycin
phosphotransferase gene and the target plasmid contains
the first two exons of the neomycin phosphotransferase
gene.

34. The vector system of Claim 22, wherein the
marker plasmid further contains a rare restriction
endonuclease sequence which is inserted within the region of
homology.

35. The vector system of Claim 22, wherein the
unique region of DNA that provides for homologous
recombination is a bacterial DNA, a viral DNA or a synthetic
DNA.

36. The vector system of Claim 22, wherein the 
unique region of DNA (a) contained in the marker plasmid 
vector system that provides for homologous recombination 
is at least 300 nucleotides. 

37. The vector system of Claim 36, wherein the
unique region of DNA ranges in size from about 300
nucleotides to 20 kilobases.

38. The vector system of Claim 37, wherein the
unique region of DNA preferably ranges in size from 2 to
10 kilobases.

39. The vector system of Claim 22, wherein the
first selectable marker DNA is split into at least three
exons.

40. The vector system of Claim 22, wherein the
unique region of DNA that provides for homologous
recombination is a bacterial DNA, an insect DNA, a viral DNA
or a synthetic DNA.

41. The vector system of Claim 40, wherein the
unique region of DNA does not contain any functional
genes.

HD = Salmonella HisD Gene

N3 = Neomycin Phosphotransferase Exon 3

D = Murine Dihydrofolate reductase

E = Cytomegalovirus and SV40 Enhancers

SA = Splice acceptor

BT = Mouse Beta Globin Major Promoter

B = Bovine Growth Hormone Polyadenylation

S = SV40 Early Polyadenylation

SV = SV40 Late Polyadenylation

FIGURE 1A

Molly

D = Dihydrofolate reductase

N1 = Neomycin Phosphotransferase Exon 1

N2 = Neomycin Phosphotransferase Exon 2

VL = Anti-CD20 Light chain Leader + Variable

K = Human Kappa Constant

VH = Anti-CD20 Heavy chain Leader + Variable

G1 = Human Gamma 1 Constant

HD = Salmonella Histidinol Dehydrogenase

E = CMV and SV40 enhancers

S = SV40 Origin

SD = Splice donor

SA = Splice acceptor

C = CMV promoter/enhancer

T = HSV TK promoter and Polyoma enhancers

BT = Mouse Beta Globin Major Promoter

SV = SV40 Late Polyadenylation

B = Bovine Growth Hormone Polyadenylation

FIGURE 2A

FIGURE 2B

FIGURE 7

FIGURE 4

FIGURE 5

1330 1340 1350 1360 1370 1380

AGCAAGGCGG AGGATTGGGA AGACAATAGC AGGCATGCTG GGGATGCGGT GGGCTCTATG

1390 1400 1410 1420 1430 1440

ATCcicCC CCACCATCCA GTGCAGGAGC TCGTTATCGC TATGACGGAA CAGGTATTCG

1750 1760 1770 1780 1790 1800

CTGGTCACTT CGATGGTTTG CCCGGATAAA CGGAACTGGA AAAACTGCTG CTGGTGTTTT 

1810 1820 1830 1840 1850 1860

GCTTCCGTCA GCGCTGGATG CGGCGTGCGG TCGGCAAAGA CCAGACCGTT CATACAGAAC 

1870 1880 1890 1900 1910 1920

TGGCGATCGT TCGGCGTATC GCCAAAATCA CCGCCGTAAG CCGACCACGG GTTGCCGTTT

1930 1940 1950 1960 1970 1980

TCATCATATT TAATCAGCGA CTGATCCACC CAGTCCCAGA CGAAGCCGCC CTGTAAACGG 

ggatactS! gaaacgcctg ccagtattta ccgaaaccgc caagactgtt acccatcgcg 

2050 2060 2070 2080 2090 2100

,ggcgt!5? cgcaaaIS? cagcgg?c2 gtctctccag gtagcgaaag ccattttttg 

2110 2120 2130 2140 2150 2160

ATGGACCATT TCGGCaS« CGGGAAgSgI TGGTCTTCAT CCACGCGCGC GTACATCGGG 

2170 2180 2190 2200 2210 2220

CAAATAATAT CGGTGgSgT GGTGTCCGCT CCGCCGCCTT CATACTGCAC CGGGC6GGAA 

2230 2240 2250 2260 2270 2280

ggatcgJSS atttgatcca gcgatacagc gcgtcgtgat TAGCGCC6TG GCCTGATTCA 

2290 2300 2310 2320 2330 2340

TTCCCCAGCG ACCAGATGAT CACACTCGGG TGATTACGAT CGCGCTGCAC CATTCGCGTT 

2350 2360 2370 2380 2390 2400

ACGCGTTCGC TCATCGCCGG TAGCCAGCGC GGATCATCGG TCAGACGATT CATTGGCACC 

AT6CCGTGGG TTTCAATATT GGCTTCATCC ACCACaSX GGCCGT«« GTCGCACwJ 

2470 2480 2490 2500 2510 2520

GTGTACCACA GCGGATGGTT CGGATAATGC GAACAGCGCA CGGCGTTAAA GTTGTTCTGC

2530 2540 2550 2560 2570 2580

TTCATCAGCA GGATATCCTG CACCATCGTC TGCTCATCCA TGACCTGACC ATGCAGAGGA 

2590 2600 2610 2620 2630 2640

TGATGCTCCT CACGGTTAAC GCCTCGAATC AGCAACGGCT TGCCGTTCAG CAGCAGCAGA

2650 2660 2670 2680 2690 2700

CCATTTTCAA TCCGCACCTC GCGGAAACCG ACATCGCAGG CTTCTGCTTC AATCAGCGTG 

2710 2720 2730 2740 2750 2760 

CCGTCGGCGG TGTGCAGTTC AACCACCGCA CGATAGAGAT TCGGGATTTC GGCGCTCCAC 

2770 2780 2790 2800 2810 2820 

AGTTTCGGGT TTTCGACGTT CAGACGTAGT GTGACGCGAT CGGCATAACC ACCACGCTCA 

2830 2840 2850 2860 2870 2880 

TCGATAATTT CACCGCCGAA AGGCGCGGTG CCGCTGGCGA CCTGCGTTTC ACCCTGCCAT 

2890 2900 2910 2920 2930 2940 

AAAGAAACTG TTACCCGTAG GTAGTCACGC AACTCGCCGC ACATCTGAAC TTCAGCCTCC 

2950 2960 2970 2980 2990 3000 

AGTACAGCGC GGCTGAAATC ATCATTAAAG CGAGTGGCAA CATGGAAATC GCTGATTTGT 

3010 3020 3030 3040 3050 3060 

(•lAGTCGGTT TATGCAGCAA CGAGACGTCA CGGAAAATGC CGCTCATCCG CCACATATCC 

3070 3080 3090 3100 3110 3120

TGATCTTCCA GATAACTGCC GTCACTCCAG CGCAGCACCA TCACCGCGAG GCGGTTTTCT 

3130 3140 3150 3160 3170 3180 

CCGGCGCGTA AAAATGCGCT CAGGTCAAAT TCAGACGGCA AACGACTGTC CTGGCCGTAA 

3190 3200 3210 3220 3230 3240

CCGACCCAGC GCCCGTTGCA CCACAGATGA AACGCCGAGT TAACGCCATC AAAAATAATT 

3250 3260 3270 3280 3290 3300 

CGCGTCTGGC CTTCCTGTAG CCAGCTTTCA TCAACATTAA ATGTGAGCGA GTAACAACCC 

3310 3320 3330 3340 3350 3360 

GTCGGATTCT CCGTGGGAAC AAACGGCGGA TTGACCGTAA TGGGATAGGT CACGTTGGTG 

3370 3380 3390 3400 3410 3420 

lAGATGGGCG CATCGTAACC GTGCATCTGC CAGTTTGAGG GGACGACGAC AGTATCGGCC 

3430 3440 3450 3460 3470 3480

TCAGGAAGAT CGCACTCCAG CCAGCTTTCC GGCACCGCTT CTGGTGCCGG AAACCAGGCA

3490 3500 3510 3520 3530 3540 

AAGCGCCATT CGCCATTCAG GCTGCGCAAC TGTTGGGAAG GGCGATCGGT GCGGGCCTCT 

3550 3560 3570 3580 3590 3600 

TCGCTATTAC GCCAGCTGGC GAAAGGGGGA TGTGCTGCAA GGCGATTAAG TTGGGTAACG 

3610 3620 3630 3640 3650 3660 

CCAGGGTTTT CCCAGTCACG ACGTTGTAAA ACGACTTAAT CCGTCGAGGG GCTGCCTCGA

3670 3680 3690 3700 3710 3720 

AGCAAACGAC CTTCCGTTGT GCAGCCAGCG GCGCCTGCGC CGGTGCCCAC AATCGTGCGC

3730 3740 3750 3760 3770 3780 

GAACAAACTA AACCAGAACA AATTATACCG GCCGCACCGC CGCCACCACC TTCTCCCGTG 

3790 3800 3810 3820 3830 3840

CCTAACATTC CAGCGCCTCC ACCACCACCA CCACCATCGA TGTCTGAATT GCCGCCCGCT

3850 3860 3870 3880 3890 3900

CCACCAATGC CGACGGAACC TCAACCCGCT GCACCTTTAG ACGACAGACA ACAATTGTTG

Mi aZl «««JE «««38 WC tJS ««kSB ..cctJcTc

„.«JS c«««35 «.c*KS ««nSS ™c«2«f «m«SS 

M „„mS xccnSS ccccktcc! tc.cc "ct'c accnJK tccccct'ccS 
cc«aSS ccccrcccfc tcuccSS ™.! CTCCCTCC 

CCATTA6TAG ATTT6CC6TC TCAAATGTTA CUCCGCcK «CC.t{S TTCTA.5ot 

ttctciS't ta«..t«« ca<rS tt«««« ccomS! «cccaJtS 

ova A7Qfl 4300 * 31 ® 4328 

\ATAATTC CAAAAAGCTC AACTACAAA? TTGATCGCGG ACGTGTTAGC CGACACAATT 

aataggJ?" gtgtggSa? ggcaaaatcg tcttcggaag caactt™ cgacgagTt 

TGGGACGACG ACGATAaSg GCCTAaSIa GCTAACACGC CCGATgSa! ATATGtJSa 

4490 4500 

gctactJSS gtacctSa? taaggggS agaatggTg gaactgggcg gagttagggg 
cgggatgI" ggagttaggg gcgggaSa? ggttgcS? taattgaIa? gcatgc^ 

. eflfl Afiflfl 4610 ^20 

CATACTTCTG CCTGCTGGgS AGCCTGgIS! CTTTCCACAC CTGGTTGCTG ACTAATTGAG 

4630 4640 46S8 4660 4670 4680 



'GCATtm TGCATACTTC TGCCT6CTGG GGAGCCTGGG GACTTTCCAC ACCCTAACTG 

A770 4730 4740 

ACACAcl??? CACA6AATTA ATTCCCcS! TTATTAATAG TAATCAATTA CGGGGTCATT 

a7«b 4790 4800 

agttcaXgI camiSS agttccJS? tacataTctt acgctaaatg GCCCGCCTGG 

a a an 4850 4860 

CTGACcJS! AACGAcSa «CCC«JSc GTCAATAATG ACGTATGTTC CCATAGTAAC 

jooa 4.QA0 4910 4920 

cccaatISS actttcSS caccixSS ggtggagtat ttacggtaaa ctgcccactt 

AOfifl 4970 4980 

aCCAGT^T CAAGTGTATC ATATGcJaaJ TACGCcJcIt ATTGACGTCA ATGACGGTAA 

ea _ 0 | *A3 a 5040 

.i«cc3S ™!l ccc.GT.1i! g.cctt.tH "ctttcct. cttccct. 

eafia eAAQ 5100 

CTCT.cIt. TT.CTcTtcI CT.TT.cSt «T«t 5 CC 5 ; TTTT«««T .C.TCT« 

C1 . a riiQ 5150 5160 

ccgtggaSJ cggtttgaS cacgggg^t? tccaagStc caccccattg ACGTCAATGG 

CAGTTtIS TGAAGCTTGG CCGGCcTg'c? TTATTTAACG TGTTTACGTC GACTCAATTG

5230 5240 5250 5260 5270 5280

TACACTAACG ACAGTCATGA AAGAAATACA AAAGCGCATA ATATTTTGAA CGACGTCGAA 

5290 5300 5310 5320 5330 5340 

CCTTTATTAC AAAACAAAAC ACAAACGAAT ATCGACAAAG CTAGATTGCT GCTACAAGAT 

5350 5360 5370 5380 5390 5400 

TTGGCAAGTT TTGTGGCGTT GAGCGAAAAT CCATTAGATA GTCCAGCCAT CGGTTCGGAA 

5410 5420 5430 5440 5450 5460 

AAACAACCCT TGTTTGAAAC TAATCGAAAC CTATTTTACA AATCTATTGA GGATTTAATA 

5470 5480 5490 5500 5510 5520 

TTTAAATTCA GATATAAAGA CGCTGAAAAT CATTTGATTT TCGCTCTAAC ATACCACCCT 

5530 5540 5550 5560 5570 5580 

AAAGATTATA AATTTAATGA ATTATTAAAA TACATCAGCA ACTATATATT GATAGACATT 

5590 5600 5610 5620 5630 5640 

.AGTTTGT GATATTAGTT TGTGCGTCTC ATTACAATGG CTGTTATTTT TAACAACAAA 

5650 5660 5670 5680 5690 5700 

CAACTGCTCG CAGACAATAG TATAGAAAAG GGAGGTGAAC TGTTTTTGTT TAACGGTTCG 

5710 5720 5730 5740 5750 5760 

TACAACATTT TGGAAAGTTA TGTTAATCCG GTGCTCCTAA AAAATCGTGT AATTGAACTA 

5770 5780 5790 5800 5810 5820

GAAGAAGCTG CGTACTATGC CGGCAACATA TTGTACAAAA CCGACGATCC CAAATTCATT 

5830 5840 5850 5860 5870 5880 

GATTATATAA ATTTAATAAT TAAAGCAACA CACTCCGAAG AACTACCAGA AAATAGCACT 

5890 5900 5910 5920 5930 5940 

GTTGTAAATT ACA GAAAAAC TATGCGCAGC GGTACTATAC ACCCCATTAA AAAAGACATA 

5950 5960 5970 5980 5990 6000 

. f ATTTATG ACAACAAAAA ATTTACTCTA TACGATAGAT ACA TATA TGG ATACGATAAT 

6010 6020 6030 6040 6050 6060 

AACTATCTTA ATTTTTATGA GGAGAAAAAT GAA AAA GAGA AGGAATACGA AGAAGAAGAC 

6070 6080 6090 6100 6110 6120 

GACAAGGCGT CTAGTTTATG TGAAAATAAA ATTATATTGT CGCAAATTAA CTGTGAATCA 

6130 6140 6150 6160 6170 6180

TTTGAAAATG ATTTTAAATA TTACCTCAGC GATTATAACT ACGCGTTTTC AATTATAGAT 

6190 6200 6210 6220 6230 6240

AATACTACAA ATGTTCTTGT TGCGTTTGGT TTGTATCCTT AATAAAAAAC AAATTTGACA 

6250 6260 6270 6280 6290 6300 

TTTATAATTG TTTTATTATT CAATAATTAC AAATAGGATT GAGACCCTTG CAGTTGCCAG

6310 6320 6330 6340 6350 6360 

CAAACGGACA GAGCTTGTCG AGGAGAGTTG TTGATTCATT GTTTGCCTCC CTGCTGCGGT 

6370 6380 6390 6400 6410 6420 

ttttcaccga agttcatgcc agtccagcgt ttttgcagca gaaaagccgc cgacttcggt 

6430 6440 6450 6460 6470 6480 

ttgcggtcgc gagtgaagat ccctttcttg ttaccgccaa cgcgcaatat gccttgcgag 

6490 6500 6510 6520 6530 6540

GTCGCAAAAT CGGCGAAATT CCATACCTGT TCACCGACGA CGGCGCTGAC GCGATCAAAG

6550 6560 6570 6580 6590 6600 

ACGCGGTGAT ACATATCCAG CCATGCACAC TGATACTCTT CACTCCACAT GTCGGTGTAC 

6610 6620 6630 6640 6650 6660 

ATTGAGTGCA GCCCGGCTAA CGTATCCACG CCGTATTCGG TGATGATAAT CGGCTGATGC 

6670 6680 6690 6700 6710 6720 

AGTTTCTCCT GCCAGGCCAG AAGTTCTTTT TCCAGTACCT TCTCTGCCGT TTCCAAATCG 

6730 6740 6750 6760 6770 6780 

CCGCTTTGGA CATACCATCC GTAATAACGG TTCAGGCACA GCACATCAAA GAGATCGCTG 

6790 6800 6810 6820 6830 6840 

ATGGTATCGG TGTGAGCGTC GCAGAACATT ACATTGACGC AGGTGATCGG ACGCGTCGGG 

6850 6860 6870 6880 6890 6900 

TCGAGTTTAC GCGTTGCTTC CGCCAGTGGC GCGAAATATT CCCGTGCACC TTGCGGACGG 

6910 6920 6930 6940 6950 6960 

GTATCCGGTT CGTTGGCAAT ACTCCACATC ACCACGCTTG GCTCGTTTTT GTCACGCGCT 

6970 6980 6990 7000 7010 7020 

ATCAGCTCTT TAATCGCCTG TAAGTGCGCT TGCTGAGTTT CCCCGTTGAC TGCCTCTTCG 

7030 7040 7050 7060 7070 7080 

CTGTACAGTT CTTTCGGCTT GTTGCCCGCT TCGAAACCAA TGCCTAAAGA GAGGTTAAAG 

7090 7100 7110 7120 7130 7140 

CCGACAGCAG CAGTTTCATC AATCACCACG ATGCCATGTT CATCTGCCCA GTCGAGCATC 

7150 7160 7170 7180 7190 7200 

TCTTCAGCGT AAGGGTAATG CGAGGTACGG TAGGAGTTGG CCCCAATCCA GTCCATTAAT 

7210 7220 7230 7240 7250 7260 

GCGTGGTCGT GCACCATCAG CACGTTATCG AATCCTTTGC CACGCAAGTC CGCATCTTCA

7270 7280 7290 7300 7310 7320 

TGACGACCAA AGCCAGTAAA GTAGAACGGT TTGTGGTTAA TCAGGAACTG TTCCCCCTTC 

7330 7340 7350 7360 7370 7380

ACTGCCACTG ACCGGATGCC GACGCGAAGC GGGTAGATAT CACACTCTGT CTGGCTTTTG 

7390 7400 7410 7420 7430 7440 

GCTGTGACGC ACAGTTCATA GAGATAACCT TCACCCGGTT GCCAGAGCTG CGGATTCACC

7450 7460 7470 7480 7490 7500 

ACTTGCAAAC TCCCGCTAGT GCCTTGTCCA GTTGCAACCA CCTGTTGATC CGCATCACGC 

7510 7520 7530 7540 7550 7560

AGTTCAACGC TGACATCACC ATTGGCCACC ACCTCCCAGT CAACAGACGC GTGGTTACAG 

7570 7580 7590 7600 7610 7620 

TCTTGCGCGA CATGCGTCAC CACGGTGATA TCGTCCACCC AGGTGTTCGG CGTGGTGTAG 

7630 7640 7650 7660 7670 7680 

AGCATTACGC TGCGATGGAT TCCGGCATAG TTAAAGAAAT CATGGAAGTA AGACTGCTTT 

7690 7700 7710 7720 7730 7740 

TTCTTGCCGT TTTCGTCGGT AATCACCATT CCCGGCGGGA TAGTCTGCCA GTTCAGTTCG 

7750 7760 7770 7780 7790 7800 

TTGTTCACAC AAACGGTGAT ACCCCTCGAC GGATTAAAGA CTTCAAGCGG TCAACTATGA

7810 7820 7830 7840 7850 7860

AGAAGTGTTC GTCTTCGTCC CAGTAAGCTA TGTCTCCAGA ATGTAGCCAT CCATCCTTGT 

7870 7880 7890 7900 7910 7920 

CAATCAAGGC GTTGGTCGCT TCCGGATTGT TTACATAACC GGACATAATC ATAGGTCCTC 

7930 7940 7950 7960 7970 7980 

TGACACATAA TTCGCCTCTC TGATTAACGC CCAGCGTTTT CCCGGTATCC AGATCCACAA 

7990 8000 8010 8020 8030 8040

CCTTCGCTTC AAAAAATGGA ACAACTTTAC CGACCGCGCC CGGTTTATCA TCCCCCTCGG 

8050 8060 8070 8080 8090 8100 

GTCTAATCAG AATAGCTGAT GTAGTCTCAG TGAGCCCATA TCCTTGTCGT ATCCCTGGAA 

8110 8120 8130 8140 8150 8160

GATGGAAGCG TTTTGCAACC GCTTCCCCGA CTTCTTTCGA AAGAGGTGCG CCCCCAGAAG 

8170 8180 8190 8200 8210 8220 

MTTCGTG TAAATTAGAT AAATCGTATT TGTCAATCAG AGTGCTTTTG GCGAAGAATG 

8230 8240 8250 8260 8270 8280

AAAATAGGGT TGGTACTAGC AACGCACTTT GAATTTTGTA ATCCTGAAGG GATCGTAAAA 

8290 8300 8310 8320 8330 8340

ACAGCTCTTC TTCAAATCTA TACATTAAGA CGACTCGAAA TCCACATATC AAATATCCGA 

8350 8360 8370 8380 8390 8400 

GTGTAGTAAA CATTCCAAAA CCGTGATGGA ATGGAACAAC ACTTAAAATC GCAGTATCCG 

8410 8420 8430 8440 8450 8460 

GAATGATTTG ATTGCCAAAA ATAGGATCTC TGGCATGCGA GAATCTGACG CAGGCAGTTC 

8470 8480 8490 8500 8510 8520

TATGCGGAAG GGCCACACCC TTAGGTAACC CAGTAGATCC AGAGGAATTG TTTTGTCACG

8530 8540 8550 8560 8570 8580 

CAAAGGAC TCTGGTACAA AATCGTATTC ATTAAAACCG GGAGGTAGAT GAGATGTGAC 

8590 8600 8610 8620 8630 8640

GAACGTGTAC ATCGACTGAA ATCCCTGGTA ATCCGTTTTA GAATCCATGA TAATAATTTT 

8650 8660 8670 8680 8690 8700

CTGGATTATT GGTAATTTTT TTTGCACGTT CAAAATTTTT TGCAACCCCT TTTTGGAAAC 

8710 8720 8730 8740 8750 8760

AAACACTACG GTAGGCTGCG AAATGTTCAT ACTGTTGACC AATTCACGTT CATTATAAAT 

8770 8780 8790 8800 8810 8820

GTCGTTCGCG GGCGCAACTG CAACTCCGAT AAATAACGCG CCCAACACCG GCATAAAGAA 

8830 8840 8850 8860 8870 8880 

TTGAAGAGAG TTTTCACTGC ATACGACGAT TCTGTGATTT GTATTCAGCC CATATCGTTT 

8890 8900 8910 8920 8930 8940

CATAGCTTCT GCCAACCGAA CGGACATTTC GAAGTATTCC GCGTACGTGA TGTTCACCTC 

8950 8960 8970 8980 8990 9000

GATATGTGCA TCTGTAAAAG GAATTGTTCC AGGAACCAGG GCGTATCTCT TCATAGCCTT 

9010 9020 9030 9040 9050 9060 

ATGCAGTTGC TCTCCAGCGG TTCCATCCTC TAGCTTTGCT TCTCAATTTC TTATTTGCAT 

9070 9080 9090 9100 9110 9120 

AATGAGAAAA AAAGGAAAAT TAATTTTAAC ACCAATTCAG TAGTTGATTG AGCAAATGCG

9130 9140 9150 9160 9170 9180

TTGCCAAAAA GGATGCTTTA GAGACAGTGT TCTCTGCACA GATAAGGACA AACATTATTC 

9190 9200 9210 9220 9230 9240

AGAGGGAGTA CCCAGAGCTG AGACTCCTAA GCCAGTGAGT GGCACAGCAT CCAGGGAGAA 

9250 9260 9270 9280 9290 9300 

ATATGCTTGT CATCACCGAA GCCTGATTCC GTAGAGCCAC ACCCTGGTAA GGGCCAATCT 

9310 9320 9330 9340 9350 9360 

GCTCACACAG GATAGAGAGG GCAGGAGCCA GGGCAGAGCA TATAAGGTGA GGTAGGATCA 

9370 9380 9390 9400 9410 9420 

GTTGCTCCTC ACATTTGCTT CTGACATAGT TGTGTTGGGA GCTTGGATCG ATCCACCATG

9430 9440 9450 9460 9470 9480 

GGCTTCAATA CCCTGATTGA CTGGAACAGC TGTAGCCCTG AACAGCAGCG TGCGCTGCTG

9490 9500 9510 9520 9530 9540

k CGTCCGG CGATTTCCGC CTCTGACAGT ATTACCC6GA CGGTCAGCGA TATTTTGGAT 

9550 9560 9570 9580 9590 9600 

AATGTAAAAA CGCGCGGTGA CGATGCCCTG CGTGAATACA GCGCTAAATT TGATAAAACA 

9610 9620 9630 9640 9650 9660 

GAAGTGACAG CGCTACGCGT CACCCCTGAA GAGATCGCCG CCGCCGGCGC GCGTCTGAGC 

9670 9680 9690 9700 9710 9720

GACGAATTAA AACAGGCGAT GACCGCTGCC GTCAAAAATA TTGAAACGTT CCATTCCGCG 

9730 9740 9750 9760 9770 9780

CAGACGCTAC CGCCTGTAGA TGTGGAAACC CAGCCAGGCG TGCGTTGCCA GCAGGTTACG 

9790 9800 9810 9820 9830 9840 

CGTCCCGTCT CGTCTGTCGG TCTGTATATT CCCGGCGGCT CGGCTCCGCT CTTCTCAACG 

9850 9860 9870 9880 9890 9900

u jCTGATGC tggcgacgcc ggcgcgcatt gcgggatgcc AGAAG6TGGT tctgtgctcg 

9910 9920 9930 9940 9950 9960 

CCGCCGCCCA TCGCTGATGA AATCCTCTAT GCGGCGCAAC TGTGTGGCGT GCAGGAAATC 

9970 9980 9990 10000 10010 10020

TTTAACGTCG GCGGCGCGCA GGCGATTGCC GCTCTGGCCT TCGGCAGCGA GTCCGTACCG

10030 10040 10050 10060 10070 10080

AAAGTGGATA AAATTTTTGG CCCCGGCAAC GCCTTTGTAA CCGAAGCCAA ACGTCAGGTC

10090 10100 10110 10120 10130 10140

AGCCAGCGTC TCGACGGCGC GGCTATCGAT ATGCCAGCCG GGCCGTCTGA AGTACTGGTG

10150 10160 10170 10180 10190 10200

ATCGCAGACA GCGGCGCAAC ACCGGATTTC GTCGCTTCTG ACCTGCTCTC CCAGGCTGAG

10210 10220 10230 10240 10250 10260

CACGGCCCGG ATTCCCAGGT GATCCTGCTG ACGCCTGATG CTGACATTGC CCGCAAGGTG

10270 10280 10290 10300 10310 10320

GCGGAGGCGG TAGAACGTCA ACTGGCGGAA CTGCCGCGCG CGGACACCGC CCGGCAGGtt

10330 10340 10350 10360 10370 10380

CTGAGCGCCA GTCGTCTGAT TGTGACCAAA GATTTAGCGC AGTGCGTCGC CATCTUAAi 



10390 10400 10410 10420 10430 10440

CAGTATGGGC CGGAACACTT AATCATCCAG ACGCGCAATG CGCGCGATTT GGTGGATGCG

10450 10460 10470 10480 10490 10500

ATTACCAGCG CAGGCTCGGT ATTTCTCGGC GACTGGTCGC CGGAATCCGC CGGTGATTAC 

•CIkSS CCU.2S? TTT.cS T»T«SS Cr^SSR TTCaSB 
tKT riS2 ATTTCCA6AA AC.UtSTc GTTCAGGAAC tacS GGGCTTTTCC 

gctctg'gS! caa«attS aacatt'cgcg ccggcagSaI gtctga'cSc ccataaaaat 
tgcgcgS cgccctS'g gagcaaISt gagccS aacactSX 
gcgtcgctg! «ttJS5 gaaaatgtS ku&S gatccSgaI! tggatJaSt 

.^tU .rmSS .caJSS GAATgES uui",!! TmiffiR 

AAATTTGTCA TGCTATTGCT TTATTTCTAA CCATTATAAG CTGCAATAAA CAAGTTAACA 
ACA.CaJtt! CATTCATT7T ATfTTTCACG TTCAwSS ^ .TTTTTtSI 

ta . t Tnn crass kt«ss: a-iss r™ss «««ss 

11050 11060 11070 11080 11090 11100

CCCCCOCCT GCCCCmS UCTGtSS TCCCTCCTTT TTCCTGCAGG CTCAAGGCGC 
11110 11120 11130 11140 11150 11160

fiCATcactt CGSCfiSS ctcctSS! cccatggcga tgcctgcttg ccgaatatca 

11170 11180 11190 11200 11210 11220

TGGTGSS TCSCcSSf TCTGGATTCA TCGACTGTGG CCGGCTGGGT GTGGCGGACC 

11230 11240 11250 11260 11270 11280

OTAtSu CATA6CGTTG GCTAcKSl ATATTGCTGA AGAGCTTGGC GGCGAATGGG 

cooss ta«ctJs; ™ss .t™ 

11350 11360 11370 11380 11390 11400

ATCGCCTTCT TGACgK™ TTCTgIgS GACTCTGGGG TTCGAAATGA CCGACCAAGC 

11410 11420 11430 11440 11450 11460

gkgccHSc ctgccItca" gagatttoa ttccaccgcc gccttctatg aaaggttsgg 
maSK gttttcSgS gatgaSctc ca.cS «ckSS 

11530 11540 11S50 11560 11570 11580 

GGAGTTCTTC GCCCACCCCA ACTTGTTTAT TGCAGCTTAT AATGGTTACA AATAAAGCAA 

11590 11600 11610 11620 11630 11640 

TAGCATCACA AATTTCACAA ATAAAGCATT TTTTTCACTG CATTCTAGTT GTGGTTTGTC 

CU«£Sg UfflSB ATCATGtStG GATCgSSc «™! WCcgS 
11710 11720 t 11J3J c "JS cACacSS ACGCGGcJS

TCTAGACTTG GCAGAACATA TCCATCGCGT CCGCCATCTL la^av^v. 

iitq« 11800 H810 U?2« 

nraSa? «mS& wa£S kc«SK.t «™< ra ™<" 

^ 11 »50 11860 11870 11880 

ccn^S ckbSSS TtcciSK ar^cU ™™« 
««S35 caaSS — S maSS ™SSS 

naoa 11990 12000 

t^iSS tctttcctaa uk&S ccJKR »««™ M «mm 
recced ancSS «™8S tk-SS ua.SK xanSS 
«u£S «m3B nenfflS ««3SS a™SS 

«™$S naJSS t«&£ -aSffi «aiSK «c«t?tS 

4 a 1 7770 12230 12240 

x« C ci™ ctctcSTc ,i«nSt tt.c«cSt «uu«ut cccccma 
«u3£ «»S5 awSS cnoSffi c»t«S 
«a<SS u«SR <™S ««£S ««S omSS 
kk&S <m»fS ««£!! «mSS «m«8S 

T<«£% TCHaKS MOtSS ««™« «™^ S 
C««S ««cS'c ««£SS ««SS 

«« c s « T «ss; TuccSfd »t.«ss ™»s ™<8S 

12650 12660 

OwSSt C«^S! 1-oSE CC»TAT«K5 T«T««T« ««««« 

c^uSS uinSB t««HS? ttcbSS T«aa3! »ct«SS! 
crc^S? «a£S «™ mocSS ".ccSK »«^S 

««S5 wjss «««3S ««as 3 « 

WC cSIS nc^gET TTmSS ac«&?< aacSS 

laaSK c«£3 <n»JSB ««S5 kmJSB «»««cS 
ccnJS? cma35 acrcSS? <wc£S «cttK 
13030 13040 13050 13060 13070 13080

ATACCTGTCC GCCmClK CTTCcSK CGTGGCGCTT TCTCATAGCT CACGCTGTAG

13090 13100 13110 13120 13130 13140

GTATCTCAGT TCGGTgSK TCCTTWa! CAAGCTGGGC TGTGTGCACG AACCCCCCGT

13150 13160 13170 13180 13190 13200

TCAGCCCGAC CGCTG«c5 TATCcSH CTATCGTCTT GAGTCCAACC CGGTAAGACA

13210 13220 13230 13240 13250 13260

CGACTTATCC CCACtSS! aGCCAaS TAACAGGm AGCAGAGCGA GGTATGTAGG

13270 13280 13290 13300 13310 13320

CGGTGCTACA GAGTTCTTGA AGTGGTGGCC TAACTACGGC TACACTAGAA GGACAGTATT

13330 13340 13350 13360 13370 13380

TGGTATCTGC GCTCTGCTGA MCttSSS CTTCGgJaAA AGAGTTGGTA GCTCTTGATC

13390 13400 13410 13420 13430 13440

^CAAACAA „«cSS mnSS ~»SS *™

13450 13460 13470 13480 13490 13500

CAGAAAAAAA GGATCTCAAG AAGATCCTTT GATCTTTTCT ACGGGGtK ACGCTCAGTG



13510 13520 13530 13540 13550 13560

GAACGAAAAC TCACGTTAAG GGATmS CATGAGATTA TCAAAAAGGA TCTTCACCTA

13570 13580 13590 13600 13610 13620

GATCCTTTTA „™S2! aUKirSS «TC«k"» »«T»T.T«T t AGTAAACTTC

13630 13640 13650 13660 13670 13680

GTCTGACAGT T.cal 3 T S? ,»«££ ««8R ™*S!5 ™»8S

13690 13700 13710 13720 13730 13740

TTCATCCATA crrecoS tccccSE? gtagauaIt acgatacggg agggcttacc

13750 13760 13770 13780 13790 13800

. CTGGCCCC AGTGCTGCAA TGATACCGCG AGACcJTcS TOCcSS CAGATTTATC

13810 13820 13830 13840 13850 13860

AGCAATAAAC CAGCciS?; gaaggg'ccS gcgcaSS cctccSS crmSS

13870 13880 13890 13900 13910 13920

CTCCATCCAG tctatSaS «t«SB McnSS agtagSS? acniSS

13930 13940 13950 13960 13970 13980

TTTGCGCAAC gttgttgcca ttgctgSgS chc&S tcacgS'cg! ccmSK



13990 14000 14010 14020 14030 14040

GGCTTCATTC UO&S «JffiS «.tJSS «.1^SS

14050 14060 14070 14080 14090 14100

CAAAAAAGCG cnuSS KK&g «TC^; «u£S T««1S"«?

14110 14120 14130 14140 14150 14160

GTTATCACTC .tc^s ««£5 ~«ss? «msa amiss

14170 14180 14190 14200 14210 14220

ATGCTTTTCT m^ss c««ffis w-iss

14230 14240 14250 14260 14270 14280

ACCGAGTTCC TCTTGCCCGG CCTCMOCG GGATaIScC GCGCCACATA GCAGAACTTT

14290 14300 14310 14320 14330 14340

AAAAGTGCTC ATCATTGGAA AACGTTCTTC GGGGCGAAAA CTCTCAAGGA TCTTACCGCT

14350 14360 14370 14380 14390 14400

GTTGAGATCC AGTTCGATGT AACCCACTCG TGCACCCAAC TGATCTTCAG CATCTTTTAC

14410 14420 14430 14440 14450 14460

TTTCACCAGC GTTTCTGGGT GAGCAAAAAC AGGAAGGCAA AATGCCGCAA AAAAGGGAAT

14470 14480 14490 14500 14510 14520

AAGGGCGACA CGGAAATGTT GAATACTCAT ACTCTTCCTT TTTCAATATT ATTGAAGCAT

14530 14540 14550 14560 14570 14580

TTATCAGGGT TATTGTCTCA TGAGCGGATA CATATTTGAA TGTATTTAGA AAAATAAACA

14590 14600 14610 14620 14630 14640

AATAGGGGTT CCGCGCACAT TTCCCCGAAA AGTGCCACCT GACGTCTAAG AAACCATTAT

14650 14660 14670 14680 14690 14700

TATCATGACA TTAACCTATA AAAATAGGCG TATCACGAGG CCCTTTCGTC TTCAAGAA . .



TTAATTAAGG GGCGGAGAA* GGGCGGAACT GGGCGGAGTT AGGGGCGGGA TGGGCGCACT

70 80 90 100 110 120

taggggcggg" actatggttg ctgactaatt gagatgcatg ctttgcatac ttctgcctgc 
tggggag"c? ggggactttc cacacctgg? tgctgactS! ttgagatgS tgctttgSt 

ACTTCTGCCT GCTGGGGAGC CTGGGGACTT TCCACAUCT AACTGACACA CATTCCACAG 
AATTAATTCC CCTAGTTATT AATAGTaJk AATTACGgS TCATTAGm ATAGCCCA?! 

TATGGAGTTC CGCGTTaS? AACTTACGG? AAATGGC^ CCTGGCTGaI CGCCCAACGA 

370 380 390 400 410 420

:CCGCc2 TTGACGTCAA TAATGACGTA TGTTCCCATA GTAACGCCAA TAGGGACTTT

■CCATTGACGT CAATGGgKS AGTATTtIcS GTAAACTGC? UCTTGgSS TACATCAAGT 

gtatcatI™ ccaagtac" ccccta™ cgtcaat"! ggtaaatgg? ccgcctggca 

550 560 570 580 590 600

ttatgcccag tacatgacct tatgggactt tcctacttgg cagtacatct acgtattagt 

610 620 630 640 650 660

catcgctS" accatgg?2 tgcggtt^tg gcagtacatc aatgggcgtg gatagcggtt 

670 680 690 700 710 720

tgactca!gS ggatttcca! gtctccaccc cattgacgtc aatggcagtt tgttttgaag 

71ft 7*ft 760 778 

pggccgI? cagcttt™ taacgtgSt acgtcgagtc aattgtacac taacgacagt 
gatgaaaga! atacaaa"? gcataata" ttgaacgacg tcgaaccttt attacaaaa? 
aaaacacIS cgaatatJS caaagctag"! ttgctgctac aagatttggc aagttttgtg 

910 920 930 940 950 960

«cgttga«? aaaatccaS agatagtcca gccatccgtt cggaaaaaca acccttgttt 

970 980 990 1000 1010 1020

gaaactaSSS gaaacctatt ttacaaatct attgaggatt taatatttaa attcagatat 

AAAGACGCTG AAAATCATTT GATTTtSct CTAACATACC ACCCTaHS TTATAAATTT 

aatgaattat taaaatacat cagcaacS? atattgatag acatttSag tttgtgIS? 

1150 1160 1170 1180 1190 1200

TAGTTTGTGC GTCTCATTAC AATGGcSS ATTTTTAACA ACAAACAACT GCTCGCAGAC 

AATAGTATAG AAAA6GW« TGAACTGTTT TTGTTTAAC? GTTCGTiS! CATTTTGGAA 

agttatgS! atccggtgct gctaaaaaat ggtgtaattg aactagaaS agctgcgtac 



1 
1330 1340 1350 1360 1370 1380



TATCCCcfc! ACATATTGTA CAAAACclTc GATCCCAAAT TCATTGATTA TATAAATTTA 
ATAATtISJ CAACAciS? CGAA6AACTA CUOiJSS GCACTGTTGT AAATTACAGA 
AAAACTATGC GCAGCG^C TATAC*^ ATTAAAAAAS ACATATATAT TTATGACAAC 
AAAAAATTTA CTCTAtIS! TAGATAcS TATGGA^G ATAATaIS! TGTTAA^ 

tatgaggaS aaaatgaaaa agagaaggaa tacgaagaI; aagacgacaa ggcgtcSg? 

a l6 c0 1679 1680 

ttaktS muA^T .tokSS .rrucrao «taTrr« ««™tttt 

-»^S TCUaSS ™™£ TTTTCAATTA T.^SJ IW-SS 

«-r»« 17RA 1790 1800 

cttgttgS? ttggtt^g?; tcgttaua! aaaacaaatt tgacatttat aattgtttta 

1I1A 18S0 18W 

TOTTciS! imaSS wn£S ccmciS! KU5 a«Ac =«a««r 
tctc^IS ^ Tan^S cacccSS ««nSS «»»SS 

.. M iqfifl 1970 1980 

.WCaSS A6CCTTTTTG C«C«]S ««CC«CT TCCCTTTfiCC 

UttltgR TCTTGTTACC ««Ac!S! »T»T t SS ««« ^ «-t5S 

7090 2190 

„ncS ccin&S «w<25 cmoSS ««««« «« 

— -maci 2150 2160 

tccaccSS aacrSS CTCTTCACTC cacatgSS tgtacattga gtgcagcccg 

14M 970 a 2210 2220 

«tucSIt ca«c8R rrccsron^ .TUtdfa «™»«nr ctcctccuc 
«m£S cmnSS iKcng! ««nSS uiwSf t™»«™ 
cntaSS »™ i~«SB ,raCT J 

t «T«g S cI .c.tt.S't? <u«a?™< t««tK? ttt»c«St 

kttcJS! «t««!S! »t»ttc??" .accnK! carmIS! 
.cui£3 .arcJS'c .cm£S rnr^S C ™? nanSK 
ndS! ccccTTccTC Mm£ ™a??c? ctt«c?"» «n<S5 

2590 2600 2610 2620 2630 2640

atmittc ckcttcoaa accmtscct aaa«««t tumcmac .sutcAtrr
tcatcaaSS cca«a?" mmSS «c««™ xcmSS 

:>7*a 2740 2750 2760 

2710 2720 2730 TTAATGCGTG GTCGTGCACC 

TAATGCGAGG TACGGTAGGA GTTGGCCCCA ATCCAGTCCA T i aa 

™« 2800 2810 2820 

,JS! »lt«S! TTTCCCACCC AACTCcIcAT CTTCATaC. A«AAA«C 

„ cfl 78fi A 2870 2880 

™„S umiSS amiSS uamKc core.™ cac™cc« 

<.«<■<» 7Q70 2930 2940 

atcccJ!'« «»««™ wntS! innA. ttttkctct aaacur 

7 aga 2990 3000 

— ATAtAMT UCCnSS «mS3 A«™CAT TCACCACTTS CAAA<TCC« 

3040 3050 3060 

CTAGTG^ GTCCA^C AACCACCTGT TGATCCGCAT CACGCAGTTC AACGCTGACA 

.... 3100 3110 3120 

tcaccaS ccacca?"? caacl!!! ««S«t tkastok «*anc 

„ e(l ^ l6 a 3170 3180 

gtcaccacgg tgatat 3 cg4! accaSS ttcggcSgg tgtagagcat tacgctgcga 

* 3220 3230 3240 

T»nSS CATACTTAAA "CT CCTTTTTCTT WCOTTTC. 

, 2gB 3290 3300 

t« CT JS cam??!? c««t!1?! tsccac&a otccttstt cau««« 

3340 3350 336® 

-cataIS'c t«.««t? Auaaro «« 5 al».c tat««a«a« «mncn 
arccdS «eni8S «muSR «aTcl«Tc tmcSS aa«c£S? 

i^eA 3*£0 3470 3480 

racntS AtTtrrSS wccAS «"™« " T " Tr " C 

3570 3530 3540 

ctctctIS? ucwSS mm£S tatccaIatc acAcarc «ttcaaaaa 

„„ 3e«a 3590 3600 

.T S «aS'c TTTACcSS ««£? T»alS«C CICOTI* »C.«»T.C 

-xajA 3650 3660 

CTCATGTAGT CT«cS CCATATc'cxT ntmSS T«AA« TO AACCGTTTTG 

CAAccecm ccc«^?c? mJSS m«5S —«SS ~ ? 

, 7fift 3770 3780 

lUnJS CT.TTTCtS ATCAGACTCC TTTTWCGAA SAATSAAAAT «n» 

CT«oJS'c ACTTTCAATT TTCTAATCCT ««« US TAA.A.C.S TCTTC^cS 

« 3880 3890 3900 

atctatIS? taagacgaI? cgaaat'cS? atatcaIata tccgagtgta gtaaacattc 



WO 98/41645 PCT/US98/03935 

24 / 51 

DNASIS 
Molly Lark 

3910 3920 3930 3940 3950 3960



caaaacc™ atggaa?gga" acaacacSa aaatcgcagt atccggaatg atttgattgc 

3970 3980 3990 4000 4010 4020

CAAAAATAGG ATCTClSS TGCGAGAATC TGACGCAGGC AGTTCTATGC GGAAGGGCCA

4030 4040 4050 4060 4070 4080

CACCClSS TAACCCAGTA GATCCAGAGG AATTGTTTTG TCACGATCAA AGGACTCTGG 

TACAAAATCG TATTCATTAA AACCGGGAGG TAGATGAW GTGACGAACG TGTACATCGA 

CTGAAAKCC TGGTAaSIg TTTTAaiS CATGATMTA ATTTia«A TTATTG™ 

**>r*f% 4740 4250 4260 

TmrnS! acgttca2a ttttttgcaa ccccttU gaaacaaaca ctacggtagg 

., ftfl **aa 4310 4320 

■XOaSS TITATaSS TCACCAlm ACCTTCATTA TAAATGTCGT TCGCGGGCGC 

aactgcSS coutaSX acgcgcSa caccggcata aagaa™! gagagtS? 

ACTGCATACG ACGATTCTXT GATTTGTATT CAGCCCaIa! CGTTTCATAG CTTCTG^

AAjta 4490 4S00 

CC«AcS2 ATTTCGAAGT ATTCCK& CGTGATG^C ACCTCGATAT GTGCATCTGT 

iMll .eiA 4550 4560 

aaaa«KS gttccaSS ccagggJgS tctcttSSa gccttatgca gttgctctcc 

lM0 4610 4620 

«c«m3 tcctctmS mar™ atttcttStt t C «t»t« 

unffi? t™3S naaS? an«SS .mniS uuJR 
cmJS «totS?? <aa£S —mJSS »™SS ««SS 

1.780 4790 4S00 

«CT««C? CCTAAGCCAG TMlS ««TcS« ««*T.T. CTTCTaTC. 

«tcc£S ««atU? «ru««c «T C T 0 SS OCM^S 

jaaa 49t0 4920 

«„««SS .«CuSS ««««tS «WM*K« WTCKTTGC TCCTOOTT 

ncmSS tiunSS imuSS area™ catiJS? omS 

ATTGACTGGA ACACCtSKS CCCToJSS WCClSS T«T«S3 TC«««S 
TC«CC?S! »CA«T.Sfc CCHmSS MCalffl! TOAT..53 

«™5S..«oSf M.mSS w«S! »««'™ 

a£! ctcaaJS? ««3S «c«SS *™"" G 
5230 5240 5250 5260 5270 5280



gcgatg*?? ctgccgtcK aaatattgaa acgttccatt ccgcgcagac gctaccgcct 
gtagatg?gg" aaacccaIc? aggcgtgcg? tgccagcJgg ttacgcgtcc cctctcSS 

GTCGGTCTGT ATATKcS C«CTC«S CCCCTaS CAACGGTCCT GATGCTGGCG 
ACGCCGgSc GCATTGcSg ATGCCAgS? GTGGTTcJS GCTCGCcSc! GCCCATcJS 
GATGAAATCC TCTATGCGGC GCUClSS GGCGTGCAGG AAATCTTTAA CGTCGGCGGC 

gcgcaggcg! ttgccgcS? ggccttcSc agcgag"c!g taccgaaagt ggataaH?? 
.tggccc 9 « gcaacgcctt tgtaac'cS! gccaaacS? aggtcagS! gcgtctcSc 
ggcgcggc?! tcoataSS agccggg™ tctgaag?;? tggtgakgc AGACAGCGGC 

GCAACAC™ ATTTCgTcIc TTCTGA 5 C?TG CTCTcS CTGA6CACGG CCCGGA™ 

caggtJ™ tgctgac™ tgatgc?™ attgcccgS aggtggcgg! ggcggtaIS 

CGTCAACTGG CGGAACTGCC GCGCGCgSc ACCGCCCgS AGGCCC?Sg CGCCAG?" 

5890 5900 5910 5920 5930 5940

CTGATTGTGA CCAAAGATTT AGCGCaSS GTCGCCATCT CTAATCAGTA TGGGCCGGAA 

5950 5960 5970 5980 5990 6000

.ACTTAATCA TCCAGAcScI CAATGCGCGC GATTTGg'tgg" ATGCGATTAC CAGCGCAGGC 

TCG6TATTTC TCKCsSS GTCGCCGCAA TCCGCCgIS ATTACgSS CGCAACCAAC 

CATGTTTTAC CCtCaSS CTATACT^CT KCTClSS GCCTTGGGTT AGCGGA^f 

CAGAAACGGA TCACCcSS GGAA^ AAAGCgSS TTTCCGCk! GGCATCAACC 

ATTGAAACAT TGGCGgS!? AGAACGTCTG ACCGCcS" AiUATcSR aCCClSS 

6250 6260 6270 6280 6290 6300

GTAAACGCCC TCAAGGAGCA AGCATGaTg! ACTGAAa'JcA CTCTCAGCGT CGCTGACTTA 

gcccgtSI! atctcc 6 gS! cctggaS?? cagacatgat aagataS?? gatgagtttg 

GACAAACCAC AACTAgK" CAGTCAAAAA AATGaSE TTGTGAaIS TGTGATgS! 

TTGCTTTATT TGTAACCATT ATAAGCTGCA ATAAACAAGT TAACAACAAC AATTGCATTC

6490 6500 6510 6520 6530 6540

ATTTTATGTT TCAGGTTCAG GGGGAGGTGT GGGAGGTTTT TTAAAGCAAG TAAAACCTCT

6550 6560 6570 6580 6590 6600

ACAAATGTGG TATGGCTGAT TATGATCTCT AGGGCCGGCC CTCGACGGCG CGCCTCTAGA

6610 6620 6630 6640 6650 6660

GCAGTGTGGT TTTGCAAGAG GAAGulSi GCCTCTCCAC CCAGGCCTGG AATGTTTCCA

6670 6680 6690 6700 6710 6720

CCCAATGTCG AGCAGTGTGG TTTTGCAAGA GGAAGCAAAA AGCCTCTCCA CCCAGGCCTG

6730 6740 6750 6760 6770 6780

fluram! acccaat™ GAGCAAACCC CGCCCAGCGT cttgtcattg gcgaattcga

6790 6800 6810 6820 6830 6840

acacgcSS? cacTtSS cggcgcgS! ccaggtccac ttcgcatatt aaggtgacgc

6850 6860 6870 6880 6890 6900

CTGTGGCCTC GAACACCGAG CGACCCTGCA GCCAATATGG GATCGGCCAT TGAACAAGAT

6910 6920 6930 6940 6950 6960

GGATTGCACG CAGGTTCTCC GGCCGCTTGG GTGGAGAGGC TATTCGGCTA TGACTGGGCA

6970 6980 6990 7000 7010 7020

CAACAGACAA TCGGCTGCTC TGATGCCGCC GTGTTCCGGC TGTCAGCGCA GGGGCGCCCG

7030 7040 7050 7060 7070 7080

GTTCTTTTTG TCAAGACCGA CCTGTCCGGT GCCCTGAATG AACTGCAGGT AAGTGCGGCC

7090 7100 7110 7120 7130 7140

GTCGATGGCC GAGGCGGCCT CGGCCTCTGC ATAAATAAAA AAAATTAGTC AGCCATGCAT

7150 7160 7170 7180 7190 7200

GGGGCGGAGA ATGGGCGGAA CTGGGCGGAG TTAGGGGCGG GATGGGCGGA GTTAGGGGCG

7210 7220 7230 7240 7250 7260

"»CTAT§? TGCTGACTAA TTGAGATGCA TGCTTTGCAT ACTTCTGCCT GCTGGGGAGC

7270 7280 7290 7300 7310 7320

CTGGGGACTT TCCAaCCTG GTTGCTGACT AATTGAGATG CATGCTTTCC ATACTTCTGC

7330 7340 7350 7360 7370 7380

CTGCTGGGGA GCCTGGGGAC TTTCCACACC CTAACTGACA CACATTCCAC AGAATTAATT

7390 7400 7410 7420 7430 7440

CCCCTAGTTA TTAATAGTAA TCAATTACGG GGTCATTAGT TCATAGCCCA TATATGGAGT

7450 7460 7470 7480 7490 7500

TCCGCGTTAC ATAACTTACG GTAAATGGCC CGCCTGGCTG ACCGCCCAAC GACCCCCGCC

7510 7520 7530 7540 7550 7560

CATTGACGTC AATAATGACG TATGTTCCCA TAGTAACGCC AATAGGGACT TTCCATTGAC

7570 7580 7590 7600 7610 7620

GTCAATGGGT GGACTATTTA CGGTAAACTG CCCACTTGGC ACTACATCAA GTGTATCATA

7630 7640 7650 7660 7670 7680

TGCCAAGTAC GCCCCCTATT GACGTCAATG ACGGTAAATG GCCCGCCTGG CATTATGCCC

7690 7700 7710 7720 7730 7740

ACTACATGAC CTTATGGGAC TTTCCTACTT GGCAGTACAT CTACGTATTA GTCATCGaA

7750 7760 7770 7780 7790 7800

TTACCATGGT GATGCGGTTT TGGCAGTACA TCAATGGGCG TGGATAGCGG TTTGACTCAC

7810 7820 7830 7840 7850 7860

GGGGATTTCC AAGTCTCCAC CCCATTGACG TCAATGGGAG TTTGTTTTGG CACCAAAATC 

7870 7880 7890 7900 7910 7920

aacgg JS tccaaaItg? cgtmoSct ccgccccatt gacgcaaatc ggcggtaggc 

7930 7940 7950 7960 7970 7980

gtgtacgct! ggaggtSI? ataagcaSg CTGGGTACGT GAACCGTCAG ATCGCCTGGA 

7990 8000 8010 8020 8030 8040

GACGCCATCA CASAKKK ACTATGgSK TTCAGGTGCA GATTATCAGC TTCCTGCTAA 

8050 8060 8070 8080 8090 8100

TCAGTGCTTC AGTCATAATG TCCAGAGgE AAATTGTTCT CTCCCAGTCT CCAGCAATCC 

8110 8120 8130 8140 8150 8160



tgtctgSS tccaggggIg aaggtcacaa tgaotgcag ggccagctca AGTGTAAGTT 

8170 8180 8190 8200 8210 8220

atccIc" gttccagXg aagccagE? cctcccccaa accctggatt tatgccacat 

8230 8240 8250 8260 8270 8280

£ caacc"g? ttctggagt? cctgttcgct tcagtggcag tgggtctggg acttcttact 

8290 8300 8310 8320 8330 8340

ctctcaS? cagcagJSg gaggct"" atgctgccac ttattactgc cagcagtgga 

o, fift a3 7a 8380 8390 8400 

ctagta"?? acccacgS? ggagggS ccaagctgga aatcaaacgt acggtggctg 
caccatJS? crrcATm! ccgcca^S atgagcIIS? gaaatctgga actgcctctg 

QA-yn 9 a** 8AQO 8S00 8510 8520 

ttgtgtg™ gctgaatII! ttctatccca gagaggccaa agtacagtgg aaggtggata 

fl „ ft tt -, fl occa 3560 8570 8580 

-GCCCTcS ATCGGGTAAC TCCCAGGAGA GTGTCACAGA GCAGGACAGC AAGGACAGCA

cctacag??? cagcagSJ ctgacgc?" gcaaagSg! ctacgagH! cacaaagtct 

flCCfl a(iflo 9 fi 7Q 8680 8690 8700 

acccctHS agtcac'cK? cagggcctX gctcgcccgt cacaaagagc ttcaacaggg 

8750 8760 

GAGAGTGTTG AATTCAGATC CGTTAa'cS? TACCAaSS CTAGACTGGA TTCGTGACAA

ana ••na A7oa 8800 8810 8820 

CATGCGGCCG TGATATcSc GTATGATCAG CCTCGACTGT GCCTTCTAGT TGCCAGCCAT 

s«?a ina tica 8860 8870 8880 

CTGTTGTTTG CCCCTcSS CTKCtSct TGACCCTGGA AGGTGCCACT CCCACTGTCC 

ocaa aaaa ftQia 8920 8930 8940 

TTTCCTAATA AAATGAGGAA Amai^ ATTGTCTGAG TAGGTGTCAT TCTATTCTGG 

8480 8990 9000 

GGGGTGGGGT GGGGCAGGAC AGCAAGGGGG AGGATTGGGA AGACAATAGC AGGCATGCTG 

AAAA 9050 9060 

GGGATGCGGT GGGCTctSJ GAACCaJct! GGGCTCGACA GCTATGCCAA GTACGCCCCC 

tattga'cE? aatgacgS5 aatggc 9 c?g! ctggcaS!? gcccag?"c! tgaccttatg 
9130 9140 9150 9160 9170 9180

GGACTTTCCT ACTTGGCAGT ACATCTACGT ATTAGTCATC GCTATTACCA TGGTGATGCG 

9190 9200 9Z10 9220 9230 9240 

GTTTTGGCAG TACATaATG GGCGTGGATA GCGGTTTGAC TCACGGGGAT TTCCAAGTCT

9250 9260 9270 9280 9290 9300 

CCACCCCATT GACGTCAATG GGAGTTTGTT TTGGCACCAA AATCAACGGG ACTTTCCAAA 

9310 9320 9330 9340 9350 9360 

ATGTCGTAAC AACTCCGCCC CATTGACGCA AATGGGCGGT AGGCGTGTAC GGTGGGAGGT 

9370 9380 9390 9400 9410 9420

CTATATAAGC AGAGCTGGGT ACGTCCTCAC ATTCAGTGAT CACCACTGAA CACAGACCCG

9430 9440 9450 9460 9470 9480

TCGACATGGG TTGGAGCCTC ATCTTGCTCT TCCTTGTCGC TGTTCCTACG CGTGTCCTGT

9490 9500 9510 9520 9530 9540

CCCAGGTACA ACTGCAGCAG CCTGGGGCTG AGCTGGTGAA GCCTGGGGCC TCAGTCAAGA

9550 9560 9570 9580 9590 9600 

TGTCCTGCAA GGCTTCTGGC TACACATTTA CCAGTTACAA TATGCACTGG GTAAAACAGA 

9610 9620 9630 9640 9650 9660 

CACCTGGTCG GGGCCTGGAA TGGATTGGAG CTATTTATCC CGGAAATGGT GATACTTCCT 

9670 9680 9690 9700 9710 9720

ACAATCAGAA GTTCAAACGC AAGGCCACAT TGACTGCAGA CAAATCCTCC AGCACAGCCT

9730 9740 9750 9760 9770 9780

ACATGCAGCT CAGCAGCCTG ACATCTGAGG ACTCTGCGGT CTATTACTGT GCAAGATCGA

9790 9800 9810 9820 9830 9840

CTTACTACGG CGGTGACTGG TACTTCAATG TCTGGGGCGC AGGGACCACG GTCACCGTCT

9850 9860 9870 9880 9890 9900

CTGCAGCTAG CACCAAGGCC CCATCGGTCT TCCCCCTGGC ACCCTCCTCC AAGAGCACCT

9910 9920 9930 9940 9950 9960

CTGGGGGCAC AGCGGCCCTG GGCTGCCTGG TCAAGGACTA CTTCCCCGAA CCGGTGACGG

9970 9980 9990 10000 10010 10020

TGTCGTGGAA CTCAGCCGCC CTGACCAGCG GCGTGCACAC CTTCCCGGCT GTCCTACAGT

10030 10040 10050 10060 10070 10080

CCTCAGGACT CTACTCCCTC AGCAGCGTGG TGACCGTGCC CTCCAGCAGC TTGGGCACCC

10090 10100 10110 10120 10130 10140

AGACCTACAT CTGCAACGTG AATCACAAGC CCAGCAACAC CAAGCTGGAC AAGAAAGCAG

10150 10160 10170 10180 10190 10200

AGCCCAAATC TTGTGACAAA ACTCACACAT GCCCACCGTG CCCAGCACCT GAACTCCTGG

10210 10220 10230 10240 10250 10260

GGGGACCGTC AGTCTTCCTC TTCCCCCCAA AACCCAAGGA CACCCTCATG ATCTCCCGGA

10270 10280 10290 10300 10310 10320

CCCCTGAGGT CACATGCGTG GTCGTGGACG TGAGCCACGA AGACCCTCAG GTCAAGTTCA 

10330 10340 10350 10360 10370 10380 

ACTGGTACGT GCACGGCCTG GAGGTGCATA ATGCCAAGAC AAAGCCGCGG GAGGAGCAGT 

10390 10400 10410 10420 10430 10440

ACAACAGCAC GTACCGTGTG GTCACCGTCC TCACCGTCCT GCACCAGGAC TGGCTGAATG

10450 10460 10470 10480 10490 10500

GCAAGGAGTA CAAGTGCAAG GTCTCCAACA AAGCCCTCCC AGCCCCCATC GAGAAAACCA

10510 10520 10530 10540 10550 10560 

TCTCCAAAGC CAAAGGGCAG CCCCGAGAAC CACAGGTGTA CACCCTGCCC CCATCCCGGG 

10570 10580 10590 10600 10610 10620

ATGAGCTGAC CAAGAACCAG GTCAGCCTGA CCTGCCTGGT CAAAGGCTTC TATCCCAGCG 

10630 10640 10650 10660 10670 10680 

ACATCGCCGT GGAGTGGGAG AGCAATGGGC AGCCGGAGAA CAACTACAAG ACCACCCCTC 

10690 10700 10710 10720 10730 10740 

CCGTGCTGGA CTCCGACGGC TCCTTCTTCC TCTACAGCAA GCTCACCGTG GACAAGAGCA 

10750 10760 10770 10780 10790 10800 

r VTGGCAGCA GGGGAACGTC TTCTCATGCT CCGTGATGCA TGAGGCTCTG CACAACCACT

10810 10820 10830 10840 10850 10860

ACACGCAGAA GAGCCTCTCC CTGTCTCCGG GTAAATGAGG ATCCGTTAAC GGTTACCAAC

10870 10880 10890 10900 10910 10920 

TACCTAGACT GGATTCGTGA CAACATGCGG CCGTGATATC TACGTATGAT CAGCCTCGAC 

10930 10940 10950 10960 10970 10980 

TGTCCCTTCT AGTTGCCAGC CATCTGTTGT TTGCCCCTCC CCCGTGCCTT CCTTGACCCT 

10990 11000 11010 11020 11030 11040

GGAAGGTGCC ACTCCCACTG TCCTTTCCTA ATAAAATGAG GAAATTGCAT CGCATTGTCT

11050 11060 11070 11080 11090 11100

GAGTAGGTGT CATTCTATTC TGGGGGGTGG GGTGGGGCAG GACAGCAAGG GGGAGGATTG 

11110 11120 11130 11140 11150 11160 

"'SAAGACAAT AGCAGGCATG CTGGGGATGC GGTGGGCTCT ATGGAACCAG CTGGGGCTCG 

11170 11180 11190 11200 11210 11220

ACAGCAACGC TAGGTCGAGG CCGCTACTAA CTCTCTCCTC CCTCCTTTTT CCTGCAGGAC

11230 11240 11250 11260 11270 11280

GAGGCAGCGC GGCTATCCTG GCTGGCCACG ACGGGCGTTC CTTGCGCAGC TGTGCTCGAC

11290 11300 11310 11320 11330 11340

GTTCTCACTG AAGCGGGAAG GGACTGGCTG CTATTGGGCG AAGTGCCGGG GCAGGATCTC

11350 11360 11370 11380 11390 11400

CTGTCATCTC ACCTTGCTCC TCCCGAGAAA GTATCCATCA TGGCTGATGC AATGCGGCGG

11410 11420 11430 11440 11450 11460

CTGCATACGC TTGATCCGGC TACCTGCCCA TTCGACCACC AAGCGAAACA TCGCATCGAG

11470 11480 11490 11500 11510 11520

CCAGCACGTA CTCCGATGGA AGCCGGTCTT GTCGATCAGG ATGATCTGGA CGAAGAGCAT

11530 11540 11550 11560 11570 11580

CAGGGCCTCG CGCCAGCCCA ACTGTTCGCC AGGTAAGTGA GCTCCAATTC AAGCTTCCTA

11590 11600 11610 11620 11630 11640

GGGCCGCCAG CTAGTAGCTT TGCTTCTCAA TTTCTTATTT GCATAATGAG AAAAAAAGGA 

11650 11660 11670 11680 11690 11700 

AAATTAATTT TAACACCAAT TCAGTAGTTG ATTGAGCAAA TGCGTTGCCA AAAAGGATGC

11710 11720 11730 11740 11750 11760

TTTAGAGACA GTGTTCTCTG CACAGATAAG GACAAACATT ATTCAGAGGC AGTACCCAGA

11770 11780 11790 11800 11810 11820

GCTGAGACTC CTAAGCCAGT GAGTGGCACA GCATCCAGGG AGAAATATGC TTGTCATCAC

11830 11840 11850 11860 11870 11880 

CGAACCCTGA TTCCGTAGAG CCACACCCTG GTAAGGGCCA ATCTGCTCAC ACAGGATACA 

11890 11900 11910 11920 11930 11940

GAGGGCAGGA GCCAGGCCAG AGCATATAAG GTGACGTAGG ATCAGTTGCT CCTCACATTT 

11950 11960 11970 11980 11990 12000 

GCTTCTGACA TAGTTGTGTT GGGAGCTTGG ATAGCTTGGG GGGGGGACAG CTCAGGGCTG 

12010 12020 12030 12040 12050 12060 

CGATTTCGCG CCAAACTTGA CGGCAATCCT AGCGTGAAGG CTGGTAGGAT TTTATCCCCG 

12070 12080 12090 12100 12110 12120

GCCATCAT GGTTCGACCA TTGAACTGCA TCGTCGCCGT GTCCCAAAAT ATGGGGATTG 

12130 12140 12150 12160 12170 12180 

.GCAAGAACGG AGACCTACCC TGGCCTCCGC TCAGGAACGA GTTCAAGTAC TTCCAAAGAA

12190 12200 12210 12220 12230 12240 

TGACCACAAC CTCTTCAGTG GAAGGTAAAC AGAATCTGGT GATTATGGGT AGGAAAACCT 

12250 12260 12270 12280 12290 12300

GCTTCTCCAT TCCTGAGAAG AATCGACCTT TAAAGGACAG AATTAATATA GTTCTCAGTA 

12310 12320 12330 12340 12350 12360 

GAGAACTCAA AGAACCACCA CGAGGAGCTC ATTTTCTTGC CAAAAGTTTG GATGATGCCT

12370 12380 12390 12400 12410 12420

TAAGACTTAT TGAACAACCG GAATTGGCAA GTAAAGTAGA CATGGTTTGG ATAGTCGGAG 

12430 12440 12450 12460 12470 12480 

AGTTCTGT TTACCAGGAA GCCATGAATC AACCAGGCCA CCTCAGACTC TTTGTGACAA 

12490 12500 12510 12520 12530 12540 

GCATCATGCA GGAATTTGAA AGTGACACGT TTTTCCCAGA AATTGATTTG GGCAAATATA 

12550 12560 12570 12580 12590 12600 

AACTTCTCCC AGAATACCCA GGCGTCCTCT CTGAGGTCCA GGAGGAAAAA GGCATCAAGT 

12610 12620 12630 12640 12650 12660 

ATAAGTTTGA AGTCTACGAG AAGAAAGACT AACAGGAAGA TGCTTTCAAG TTCTCTGCTC 

12670 12680 12690 12700 12710 12720

CCCTCCTAAA GCTATGCATT TTTATAAGAC CATGGGACTT TTGCTGGCTT TAGATCAGCC

12730 12740 12750 12760 12770 12780

TCGACTGTGC CTTCTAGTTG CCAGCCATCT GTTGTTTGCC CCTCCCCCGT GCCTTCCTTG

12790 12800 12810 12820 12830 12840

ACCCTGGAAG GTGCCACTCC CACTGTCCTT TCCTAATAAA ATGAGGAAAT TGCATCGCAT 

12850 12860 12870 12880 12890 12900 

TGTCTGAGTA GGTGTCATTC TATTCTGGGG GGTCGGGTGG GGCAGGACAG CAAGGGGGAG 

12910 12920 12930 12940 12950 12960

GATTGGGAAG ACAATAGCAG GCATGCTGGG GATGCGGTGG GCTCTATGGC TTCTGAGGCG

12970 12980 12990 13000 13010 13020

GAAAGAACCA GCTCGGGCTC GAAGCGCCCG CCCATTTCGC TGGTGGTCAG ATGCGGGATG

13030 13040 13050 13060 13070 13080

GCGTGGGACG CGGCGGGGAG CGTCACACTG AGGTTTTCCG CCAGACGCCA CTGCTGCCAG

13090 13100 13110 13120 13130 13140 

GCGCTGATGT GCCCGGCTTC TGACCATGCG GTCGCGTTCG GTTGCACTAC GCGTACTGTG 

13150 13160 13170 13180 13190 13200

AGCCAGAGTT GCCCGGCGCT CTCCGGCTGC GGTAGTTCAG GCAGTTCAAT CAACTGTTTA 

13210 13220 13230 13240 13250 13260 

CCTTGTGGAG CGACATCCAG AGGCACTTCA CCGCTTGCCA GCGGCTTACC ATCUGCGCC 

13270 13280 13290 13300 13310 13320 

ACCATCCAGT GCAGGAGCTC GTTATCGCTA TGACGGAACA GGTATTCGCT GGTCACTTCG 

13330 13340 13350 13360 1337 0 13380 

ATGGTTTGCC CGGATAAACG GAACTGGAAA AACTGCTGCT GGTGTTTTGC TTCCGTCAGC 

13390 13400 13410 13420 13430 13440

l . GGATGCG GCGTCCGGTC GGCAAAGACC AGACCGTTCA TACAGAACTG GCGATCGTTC

13450 13460 13470 13480 13490 13500

GGCGTATCGC CAAAATCACC GCCGTAAGCC GACCACGGGT TGCCGTTTTC ATCATATTTA

13510 13520 13530 13540 13550 13560

ATCAGCGACT GATCCACCCA GTCCCAGACG AAGCCGCCCT GTAAACGGGG ATACTGACGA 

13570 13580 13590 13600 13610 13620 

AACGCCTGCC AGTATTTAGC GAAACCGCCA AGACTGTTAC CCATCGCGTG GGCGTATTCG

13630 13640 13650 13660 13670 13680

CAAAGGATCA GCGGGCGCGT CTCTCCAGGT AGCGAAAGCC ATTTTTTGAT GGACCATTTC 

13690 13700 13710 13720 13730 13740 

GGCACAGCCG GGAAGGGCTG GTCTTCATCC ACGCGCGCGT ACATCGGGCA AATAATATCG 

13750 13760 13770 13780 13790 13800

x> ■ GGCCGTGG TGTCGGCTCC GCCGCCTTCA TACTGCACCG GGCGGGAAGG ATCGACAGAT

13810 13820 13830 13840 13850 13860 

TTGATCCAGC GATACAGCGC GTCGTGATTA GCGCCGTGGC CTGATTCATT CCCCAGCGAC 

13870 13880 13890 13900 13910 13920

CAGATGATCA CACTCGGGTG ATTACGATCG CGCTGCACCA TTCGCGTTAC GCGTTCGCTC 

13930 13940 13950 13960 13970 13980 

ATCGCCGGTA GCCAGCGCGG ATCATCGGTC AGACGATTCA TTGCCACCAT GCCGTGGGTT

13990 14000 14010 14020 14030 14040 

TCAATATTGG CTTCATCCAC CACATACAGG CCGTAGCGGT CGCACAGCGT GTACCACAGC 

14050 14060 14070 14080 14090 14100 

GGATGGTTCG GATAATGCGA ACAGCGCACG GCGTTAAAGT TGTTCTGCTT CATCAGCAGG

14110 14120 14130 14140 14150 14160 

ATATCCTGCA CCATCGTCTG CTCATCCATG ACCTGACCAT GCAGAGGATG ATGCTCGTGA

14170 14180 14190 14200 14210 14220

CGGTTAACGC CTCGAATCAG CAACGGCTTG CCGTTCAGCA GCAGCAGACC ATTTTCAATC

14230 14240 14250 14260 14270 14280

CGCACCTCGC GGAAACCGAC ATCGCAGGCT TCTGCTTCAA TCACCGTGCC GTCGGCGGTG

14290 14300 14310 14320 14330 14340

TGCAGTTCAA CCACCGCACG ATAGAGATTC GGCATTTCGG CGCTCCACAG TTTCGGCTTT

14350 14360 14370 14380 14390 14400

TCGACGTTCA GACGTAGTGT GACGCGATCG GCATAACCAC CACGCTCATC GATAATTTCA

14410 14420 14430 14440 14450 14460 

CCGCCGAAAG GCGCGCTGCC GCTGGCGACC TGCGTTTCAC CCTGCCATAA AGAAACTGTT 

14470 14480 14490 14500 14510 14520

ACCCGTAGGT AGTCACGCAA CTCGCCGCAC ATCTGAACTT CAGCCTCCAG TACAGCGCGG 

14530 14540 14550 14560 14570 14580 

CTGAAATCAT CATTAAAGCG AGTGGCAACA TGGAAATCGC TGATTTGTGT AGTCGGTTTA 

14590 14600 14610 14620 14630 14640 

TGCAGCAACG AGACGTCACG GAAAATGCCG CTCATCCGCC ACATATCCTG ATCTTCaGA

14650 14660 14670 14680 14690 14700

TAACTGCCGT CACTCCAGCG CAGCACCATC ACCGCGAGGC CGTTTTCTCC GGCGCGTAAA

14710 14720 14730 14740 14750 14760

AATGCGCTCA GGTCAAATTC AGACGGCAAA CGACTGTCCT GGCCGTAACC GACCCAGCGC

14770 14780 14790 14800 14810 14820 

CCGTTGCACC ACAGATGAAA CGCCGAGTTA ACCCCATCAA AAATAATTCG CGTCTGGCCT 

14830 14840 14850 14860 14870 14880 

TCCTGTAGCC AGCTTTCATC AACATTAAAT GTGAGCGAGT AACAACCCGT CGGATTCTCC 

14890 14900 14910 14920 14930 14940

GTGGGAACAA ACGGCGGATT GACCGTAATG GGATAGGTGA CGTTGGTGTA GATGGGCGCA 

14950 14960 14970 14980 14990 15000 

TCGTAACCGT GCATCTGCCA GTTTGAGGGG ACGACGACAG TATCGGCCTC AGGAAGATCG

15010 15020 15030 15040 15050 15060 

'•ACTCCAGCC AGCTTTCCGG CACCGCTTCT GGTGCCGGAA ACCAGGCAAA GCGCCATTCG 

15070 15080 15090 15100 15110 15120

CCATTCAGGC TGCGCAACTG TTGGGAAGGG CGATCGGTGC GGGCCTCTTC GCTATTACGC

15130 15140 15150 15160 15170 15180

CAGCTGGCGA AAGGGGGATG TGCTGCAAGG CGATTAAGTT GGGTAACGCC AGGGTTTTCC

15190 15200 15210 15220 15230 15240

CAGTCACGAC GTTGTAAAAC GACTTAATCC GTCGAGGGGC TGCCTCGAAG CAGACGACCT

15250 15260 15270 15280 15290 15300

TCCGTTGTGC AGCCAGCGGC GCCTGCGCCG GTGCCCACAA TCGTCCGCGA ACAAACTAAA 

15310 15320 15330 15340 15350 15360 

CCAGAACAAA TTATACCGGC GCCACCGCCG CCACCACCTT aCCCGTGCC TAACATTCCA 

15370 15380 15390 15400 15410 15420

GCGCCTCCAC CACCACCACC ACCATCGATG TCTGAATTGC CGCCCGCTCC ACCAATGCCG 

15430 15440 15450 15460 15470 15480 

ACGGAACCTC AACCCGCTGC ACCTTTAGAC GACAGACAAC AATTGTTGGA AGCTATTAGA

15490 15500 15510 15520 15530 15540

AACGAAAAAA ATCGCACTCG TCTCAGACCG GTCAAACCAA AAACGGCGCC CGAAACCAGT

15550 15560 15570 15580 15590 15600

ACAATAGTTG AGGTGCCGAC TGTGTTCCCT AAAGAGACAT TTGAGCCTAA ACCGCCGTCT




««SS acaSS «e K SS5 cccccSSS cc«SS <c«cS3 

«™2S ™»iss ™«ga ««ass ™-^s — ? 

ttkJSTc cmgS «m£2 <m£3! -™2R 

^ss; « 5 ccSis ««css; ™»ss rrcojss; o»ss 

UnJKS c««SS . t CcS «mSS «»S5 «T«g2! 
KfltSSS T«cS ««K«rS T««SS! 

.«srj cacccS?^ «»«aa ^sss mjss 

mm&s c^Tr crccis?! CCuSS CUO^S 

'nt»SS «nSS <«c&g cnniffiS «*S5 »«aJSS 

CCTTfiCCTTA cro£S CUT^S ctalSS UcJSR ««JSTc 

™e»aa ™nas ™«ffis *«jbs tc^ss 

t^SJ «»SS ™««SSS «raS» ™»«8 

arcSg? KcnSS loJSS! «a«SK nuJSS owSS 
tccnSS tttttctctc «»«S c««£S KcaSSS 

csttccacta .cccIS?! TranffiS «t«c£K tcgtcacSt cooSS 

TTCATC66TA TCnSS .nJSS »™ UK^S .Ta^Tc'c 

ificaa 16610 16620 

«*£S ««c2S? OCIlfflS «*«««C .TT„«CTT 

rrcaauc iuuSS <mc£5 urnSS »™! 

««aS£ „»£S cc«£S crcaSS? lanffiS "ctJSTc 

1C78A 16790 16S00 

rrcruuu TC« t S3? wJSS «a«SS? tmiw. ««««« 
muSS awSS cn«££ a«£S «»5S5 

MA icaAA 16910 16920 

anc&S ccaiiSS .«™ noiSS t««c*tc aaaam 
16930 16940 16950 16960 16970 16980

TACTGAGAGT GCACCATATG CGGTCTGAAA TACCGCACAG ATGCGTAAGG AGAAAATACC 

16990 17000 17010 17020 17030 17040 

GCATCAGCCC CTCTTCCGCT TCCTCGCTCA CTGACTCGCT GCGCTCGGTC GTTCGGCTGC 

17050 17060 17070 17080 17090 17100 

GGCGAGCGGT ATCAGCTCAC TCAAAGGCGG TAATACGGTT ATCCACAGAA TCAGGGGATA 

17110 17120 17130 17140 17150 17160 

ACGCAGGAAA GAACATGTGA GCAAAAGGCC AGCAAAAGGC CAGGAACCGT AAAAAGGCCG 

17170 17180 17190 17200 17210 17220 

CGTTGCTGGC GTTTTTCCAT AGGCTCCGCC CCCCTGACGA GCATCACAAA AATCGACGCT

17230 17240 17250 17260 17270 17280 

CAAGTCAGAG GTGGCGAAAC CCGACAGGAC TATAAAGATA CCAGGCGTTT CCCCCTGGAA 

17290 17300 17310 17320 17330 17340 

GCTCCCTCGT GCGCTCTCCT GTTCCGACCC TGCCGCTTAC CGGATACCTG TCCGCOTTC

17350 17360 17370 17380 17390 17400

TCCCTTCGGG AAGCGTGGCG CTTTCTCATA CCTCACGCTG TAGGTATCTC AGTTCGGTGT

17410 17420 17430 17440 17450 17460 

AGGTCGTTCG CTCCAAGCTG GGCTGTGTGC ACGAACCCCC CGTTCAGCCC GACCGCTGCG 

17470 17480 17490 17500 17510 17520

CCTTATCCGG TAACTATCGT CTTGAGTCCA ACCCGGTAAG ACACGACTTA TCGCCACTGC 

17530 17540 17550 17560 17570 17580 

CAGCAGCCAC TGGTAACAGG ATTAGCAGAG CGAGGTATGT AGGCGGTGCT ACAGAGTTCT 

17590 17600 17610 17620 17630 17640 

TGAAGTGGTG GCCTAACTAC GGCTACACTA GAAGGACAGT ATTTGGTATC TGCGCTCTGC 

17650 17660 17670 17680 17690 17700 

TGAAGCCAGT TACCTTCGGA AAAAGAGTTG GTAGCTCTTG ATCCGGCAAA CAAACCACCG 

17710 17720 17730 17740 17750 17760 

CTGGTAGCCG TGTTTTTTTT GTTTGCAAGC AGCAGATTAC GCGCAGAAAA AAAGGATCTC

17770 17780 17790 17800 17810 17820

AAGAAGATCC TTTGATCTTT TCTACGGGGT CTGACGCTCA GTGGAACGAA AACTCACGTT

17830 17840 17850 17860 17870 17880

AAGGGATTTT GGTCATGAGA TTATCAAAAA GGATCTTCAC CTAGATCCTT TTAAATTAAA 

17890 17900 17910 17920 17930 17940 

AATGAAGTTT TAAATCAATC TAAAGTATAT ATGAGTAAAC TTGGTCTGAC AGTTACCAAT 

17950 17960 17970 17980 17990 18000 

GCTTAATCAG TGAGGCACCT ATCTCAGCGA TCTGTCTATT TCGTTCATCC ATAGTTGCCT 

18010 18020 18030 18040 18050 18060 

GACTCCCCGT CGTGTAGATA ACTACGATAC GGGAGGGCTT ACCATCTGGC CCCAGTGCTG 

18070 18080 18090 18100 18110 18120 

CAATGATACC GCGAGACCCA CGCTCACCGG CTCCAGATTT ATCAGCAATA AACCAGCCAG

18130 18140 18150 18160 18170 18180 

CCGGAAGCGC CGAGCGCAGA AGTGGTCCTG CAACTTTATC CGCCTCCATC CAGTCTATTA 

18190 18200 18210 18220 18230 18240

ATTGTTGCCG GGAAGCTAGA GTAAGTAGTT CGCCAGTTAA TAGTTTGCGC AACGTTGTTG

18250 18260 18270 18280 18290 18300

CCATTGCTGC AGGCATCGTG GTGTCACGCT CGTCGTTTGG TATGGCTTCA TTCAGCTCCG

18310 18320 18330 18340 18350 18360

GTTCCCAACG ATCAAGGCGA GTTACATGAT CCCCCATGTT GTGCAAAAAA GCGGTTAGCT

18370 18380 18390 18400 18410 18420

CCTTCGGTCC TCCGATCGTT GTCAGAAGTA AGTTGGCCGC AGTGTTATCA CTCATGGTTA

18430 18440 18450 18460 18470 18480

TGGCAGCACT GCATAATTCT CTTACTGTCA TGCCATCCGT AAGATGCTTT TCTGTGACTG

18490 18500 18510 18520 18530 18540

GTGAGTACTC AACCAAGTCA TTCTGAGAAT AGTGTATGCG GCGACCGAGT TGCTCTTGCC

18550 18560 18570 18580 18590 18600

'^GCGTCAAC ACGGGATAAT ACCGCGCCAC ATAGCAGAAC TTTAAAAGTG CTCATCATTG

18610 18620 18630 18640 18650 18660

GAAAACGTTC TTCGGGGCGA AAACTCTCAA GGATCTTACC GCTGTTGAGA TCCAGTTCGA

18670 18680 18690 18700 18710 18720

TGTAACCCAC TCGTGCACCC AACTGATCTT CAGCATCTTT TACTTTCACC AGCGTTTCTG

18730 18740 18750 18760 18770 18780

GGTGAGCAAA AACAGGAAGG CAAAATGCCG CAAAAAAGGG AATAAGGGCG ACACGGAAAT

18790 18800 18810 18820 18830 18840

GTTGAATACT CATACTCTTC CTTTTTCAAT ATTATTGAAG CATTTATCAG GGTTATTGTC

18850 18860 18870 18880 18890 18900

TCATGAGCGG ATACATATTT GAATGTATTT AGAAAAATAA ACAAATAGGG GTTCCGCGCA

18910 18920 18930 18940 18950 18960

"MTTCCCCG AAAAGTGCCA CCTGACGTCT AAGAAACCAT TATTATCATG ACATTAACCT

18970 18980 18990 19000 19010 19020

ATAAAAATAG GCGTATCACG AGGCCCTTTC GTCTTCAAGA
D = Inactive dihydrofolate reductase   SO = SV40 origin of replication

X = CMV and SV40 enhancers

H = Inactive Salmonella histidinol dehydrogenase

T = Herpes simplex thymidine kinase promoter and polyoma enhancer

C = Cytomegalovirus promoter/enhancer   B = Bovine growth hormone polyadenylation
N1 = Neomycin phosphotransferase exon 1   N2 = Neomycin phosphotransferase exon 2
K = Human kappa constant   G1 = Human gamma 1 constant

VL = Variable light chain anti-CD23 primate 5E8 and leader
VH = Variable heavy chain anti-CD23 primate 5E8N- and leader

Mandy cut Xba I Xho I and ligated to Xba I Xho I fragment from XKG1+CD23 5E8N-SHL



Map by Mitchell Reff   Constructed by Karen McLachlan   [month illegible]/26/97
Noncutters = AflII, AvrII, HindIII, I-PpoI, I-SceI, PmlI, RsrII, SgfI, SrfI
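
The noncutter line in the legend can be checked mechanically against the sequence that follows. The sketch below is a minimal illustration, not part of the original listing: the function name `count_sites` and the `SITES` table are hypothetical, the enzyme names are read from a partly illegible legend, and only three enzymes with well-known palindromic recognition sites are included. Because those sites are palindromic, scanning one strand is sufficient, and the sequence is treated as circular because it describes a plasmid.

```python
# Recognition sites (5'->3') for three enzymes read from the legend's
# noncutter list. The enzyme names are an assumption based on a garbled line.
SITES = {
    "HindIII": "AAGCTT",
    "AvrII": "CCTAGG",
    "PmlI": "CACGTG",
}


def count_sites(sequence: str, site: str, circular: bool = True) -> int:
    """Count occurrences of a recognition site in a DNA sequence.

    Whitespace (the 10-base grouping used in the listing) is ignored.
    For a circular molecule, sites spanning the origin are also counted.
    """
    seq = "".join(sequence.split()).upper()
    site = site.upper()
    if circular and len(seq) >= len(site):
        # Append the first len(site)-1 bases so a site that wraps around
        # the origin of the plasmid is still found.
        seq = seq + seq[: len(site) - 1]
    count = 0
    start = seq.find(site)
    while start != -1:
        count += 1
        start = seq.find(site, start + 1)
    return count


def noncutters(sequence: str) -> list:
    """Return the enzymes in SITES whose site does not occur in the sequence."""
    return [name for name, site in SITES.items() if count_sites(sequence, site) == 0]
```

Passing the full plasmid sequence as one string (spaces and line breaks are ignored) to `noncutters` would list which of the three enzymes have no site; any OCR damage in the sequence text would of course affect the result.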
10 20 30 40 50 60

TTAATTAAGG GGCGGAGAAT GGGCGGAACT GGGCGGAGTT AGGGGCGGGA TGGGCGGAGT

70 80 90 100 110 120 

TAGGGGCGGG ACTATGGTTG CTGACTAATT GAGATGCATG CTTTGCATAC TTCTGCCTGC 

130 140 150 160 170 180 

TGGGGAGCCT GGGGACTTTC CACACCTGGT TGCTGACTAA TTGAGATGCA TGCTTTCCAT 

190 200 210 220 230 240 

ACTTCTGCCT GCTGGGGAGC CTGGGGACTT TCCACACCCT AACTGACACA CATTCCACAG 

250 260 270 280 290 300 

AATTAATTCC CCTAGTTATT AATAGTAATC AATTACGGGG TCATTAGTTC ATAGCCCATA 

310 320 330 340 350 360 

TATGGAGTTC CGCGTTACAT AACTTACGGT AAATGGCCCG CCTGGCTGAC CGCCCAACGA 

370 380 390 400 410 420 

CCCCCGCCCA TTGACGTCAA TAATGACGTA TGTTCCCATA GTAACGCCAA TAGGGACTTT

430 440 450 460 470 480 

CCATTGACGT CAATGGGTGG AGTATTTACG GTAAACTGCC CACTTGGCAG TACATCAAGT

490 500 510 520 530 540

GTATCATATG CCAAGTACGC CCCCTATTGA CGTCAATGAC GGTAAATGGC CCGCCTGGCA 

550 560 570 580 590 600 

TTATGCCCAG TACATGACCT TATGGGACTT TCCTACTTGG CAGTACATCT ACGTATTAGT 

610 620 630 640 650 660 

CATCGCTATT ACCATGGTGA TGCGGTTTTG GCAGTACATC AATGGGCGTG GATAGCGGTT 

670 680 690 700 710 720 

TGACTCACGG GGATTTCCAA GTCTCCACCC CATTGACGTC AATGGGAGTT TGTTTTGAAG 

730 740 750 760 770 780 

,'GTTTAAAC AGCTTGGCCG GCCAGCTTTA TTTAACGTGT TTACGTCGAG TCAATTGTAC 

790 800 810 820 830 840 

ACTAACGACA GTGATGAAAG AAATACAAAA GCGCATAATA TTTTGAACGA CGTCGAACCT

850 860 870 880 890 900 

TTATTACAAA ACAAAACACA AACGAATATC CACAAAGCTA GATTGCTGCT ACAAGATTTG 

910 920 930 940 950 960 

GCAAGTTTTG TGGCGTTGAG CGAAAATCCA TTAGATAGTC CAGCCATCGG TTCGGAAAAA 

970 980 990 1000 1010 1020 

CAACCCTTGT TTGAAACTAA TCGAAACCTA TTTTACAAAT CTATTGAGGA TTTAATATTT 

1030 1040 1050 1060 1070 1080 

AAATTCAGAT ATAAAGACGC TGAAAATCAT TTGATTTTCG CTCTAACATA CCACCCTAAA 

1090 1100 1110 1120 1130 1140

GATTATAAAT TTAATGAATT ATTAAAATAC ATCAGCAACT ATATATTGAT AGACATTTCC

1150 1160 1170 1180 1190 1200 

ACTTTGTGAT ATTAGTTTGT GCGTCTCATT ACAATGGCTG TTATTTTTAA CAACAAACAA 

1210 1220 1230 1240 1250 1260 

CTGCTCGCAG ACAATAGTAT AGAAAAGGGA GGTGAACTGT TTTTGTTTAA CGGTTCGTAC 



1270 1280 1290 1300 1310 1320 

AACATTTTGG AAAGTTATGT TAATCCGGTG CTGCTAAAAA ATGGTGTAAT TGAACTAGAA 
1330 1340 1350 1360 1370 1380

GAAGCTGCGT ACTATGCCGG CAACATATTG TACAAAACCG ACGATCCCAA ATTCATTGAT 

1390 1400 1410 1420 1430 1440 

TATATAAATT TAATAATTAA AGCAACACAC TCCGAAGAAC TACCAGAAAA TAGCACTGTT

1450 1460 1470 1480 1490 1500 

GTAAATTACA GAAAAACTAT GCGCAGCGGT ACTATACACC CCATTAAAAA AGACATATAT 

1510 1520 1530 1540 1550 1560 

ATTTATGACA ACAAAAAATT TACTCTATAC GATAGATACA TATATGGATA CGATAATAAC 

1570 1580 1590 1600 1610 1620 

TATGTTAATT TTTATGAGGA GAAAAATGAA AAAGAGAAGG AATACGAAGA AGAAGACGAC 

1630 1640 1650 1660 1670 1680 

AAGGCGTCTA GTTTATGTGA AAATAAAATT ATATTGTCGC AAATTAACTG TGAATCATTT 

1690 1700 1710 1720 1730 1740 

GAAAATGATT TTAAATATTA CCTCAGCGAT TATAACTACG CGTTTTCAAT TATAGATAAT 

1750 1760 1770 1780 1790 1800 

ACTACAAATG TTCTTGTTGC GTTTGGTTTG TATCGTTAAT AAAAAACAAA TTTGACATTT 

1810 1820 1830 1840 1850 1860 

ATAATTGTTT TATTATTCAA TAATTACAAA TAGGATTGAG ACCCTTGCAG TTGCCAGCAA

1870 1880 1890 1900 1910 1920 

ACGGACAGAG CTTGTCGAGG AGAGTTGTTG ATTCATTGTT TGCCTCCCTG CTGCGGTTTT 

1930 1940 1950 1960 1970 1980 

TCACCGAAGT TCATGCCAGT CCAGCGTTTT TGCAGCACAA AAGCCGCCGA CTTCGGTTTG 

1990 2000 2010 2020 2030 2040 

CGGTCGCGAG TGAAGATCCC TTTCTTGTTA CCGCCAACGC GCAATATGCC TTGCGAGGTC 

2050 2060 2070 2080 2090 2100 

GCAAAATCGG CGAAATTCCA TACCTGTTCA CCGACGACGG CGCTGACGCG ATCAAAGACG 

2110 2120 2130 2140 2150 2160 

CGGTGATACA TATCCAGCCA TGCACACTGA TACTCTTCAC TCCACATGTC GGTGTACATT 

2170 2180 2190 2200 2210 2220 

GAGTGCAGCC CGGCTAACGT ATCCACGCCG TATTCGGTGA TGATAATCGG CTGATGCAGT 

2230 2240 2250 2260 2270 2280 

TTCTCCTGCC AGGCCAGAAG TTCTTTTTCC AGTACCTTCT CTGCCGTTTC CAAATCGCCG 

2290 2300 2310 2320 2330 2340 

CTTTGGACAT ACCATCCCTA ATAACGGTTC AGGCACAGCA CATCAAAGAG ATCGCTGATG

2350 2360 2370 2380 2390 2400 

GTATCGGTGT GAGCGTCGCA GAACATTACA TTGACGCAGG TGATCGGACG CGTCGGGTCG 

2410 2420 2430 2440 2450 2460 

AGTTTACCCG TTGCTTCCGC CAGTGGCGCG AAATATTCCC GTCCACCTTG CGGACGGGTA 

2470 2480 2490 2500 2510 2520 

TCCGGTTCGT TGGCAATACT CCACATCACC ACGCTTGCGT GGTTTTTGTC ACGCGCTATC 

2530 2540 2550 2560 2570 2580 

AGCTCTTTAA TCCCCTGTAA GTGCGCTTGC TCAGTTTCCC CGTTGACTGC CTCTTCGCTG 



2590 2600 2610 2620 2630 2640

TACAGTTCTT TCGGCTTGTT GCCCGCTTCG AAACCAATGC CTAAAGAGAG GTTAAAGCCG

2650 2660 2670 2680 2690 2700

ACAGCAGCAG TTTCATCAAT CACCACGATG CCATGTTCAT CTGCCCAGTC GACCATCTCT

2710 2720 2730 2740 2750 2760

TCAGCGTAAG GGTAATGCGA GGTACGGTAG GAGTTGGCCC CAATCCAGTC CATTAATGCG

2770 2780 2790 2800 2810 2820

TGGTCGTGCA CCATCAGCAC GTTATCGAAT CCTTTGCCAC GCAAGTCCGC ATCTTCATGA

2830 2840 2850 2860 2870 2880

CGACCAAAGC CAGTAAAGTA GAACGGTTTG TGGTTAATCA GGAACTGTTC GCCCTTCACT

2890 2900 2910 2920 2930 2940

GCCACTGACC GGATGCCGAC GCGAAGCGGG TAGATATCAC ACTCTGTCTG GCTTTTGGCT

2950 2960 2970 2980 2990 3000

TGACGCACA GTTCATAGAG ATAACCTTCA CCCGGTTGCC AGAGGTGCGG ATTCACCACT

3010 3020 3030 3040 3050 3060

TGCAAAGTCC CGCTAGTGCC TTGTCCAGTT GCAACCACCT GTTGATCCGC ATCACGCAGT

3070 3080 3090 3100 3110 3120

TCAACGCTGA CATCACCATT GGCCACCACC TGCCAGTCAA CAGACGCGTG GTTACAGTCT

3130 3140 3150 3160 3170 3180

TGCGCGACAT GCGTCACCAC GGTGATATCG TCCACCCAGG TGTTCGGCGT GGTGTAGAGC

3190 3200 3210 3220 3230 3240

ATTACGCTGC GATGGATTCC GGCATAGTTA AAGAAATCAT GGAAGTAAGA CTGCTTTTTC

3250 3260 3270 3280 3290 3300

TTGCCGTTTT CGTCGGTAAT CACCATTCCC GGCGGGATAG TCTGCCAGTT CAGTTCGTTG

3310 3320 3330 3340 3350 3360

TCACACAAA CGGTGATACC CCTCGACGGA TTAAAGACTT CAAGCGGTCA ACTATGAAGA

3370 3380 3390 3400 3410 3420

AGTGTTCGTC TTCGTCCCAG TAAGCTATGT CTCCAGAATG TAGCCATCCA TCCTTGTCAA

3430 3440 3450 3460 3470 3480

TCAAGGCGTT GGTCGCTTCC GCATTGTTTA CATAACCGGA CATAATCATA GGTCCTCTGA

3490 3500 3510 3520 3530 3540

CACATAATTC GCCTCTCTGA TTAACGCCCA GCGTTTTCCC GGTATCCAGA TCCACAACCT

3550 3560 3570 3580 3590 3600

TCGCTTCAAA AAATGGAACA ACTTTACCGA CCGCGCCCGG TTTATCATCC CCCTCGGGTG

3610 3620 3630 3640 3650 3660

TAATCAGAAT AGCTGATGTA GTCTCAGTGA GCCCATATCC TTGTCGTATC CCTGGAAGAT

3670 3680 3690 3700 3710 3720

GGAAGCGTTT TGCAACCCCT TCCCCGACTT CTTTCGAAAG AGGTGCGCCC CCAGAAGCAA

3730 3740 3750 3760 3770 3780

TTTCGTGTAA ATTAGATAAA TCGTATTTGT CAATCAGAGT CCTTTTGGCG AAGAATGAAA

3790 3800 3810 3820 3830 3840

ATAGGGTTGG TACTAGCAAC GCACTTTGAA TTTTGTAATC CTGAAG££AT CGTAAAAACA

3850 3860 3870 3880 3890 3900

GCTCTTCTTC AAATCTATAC ATTAAGACGA CTCGAAATCC ACATATCAAA TATCCGAGTG

3910 3920 3930 3940 3950 3960

TAGTAAACAT TCCAAAACCG TGATGGAATG GAACAACACT TAAAATCGCA GTATCCGGAA

3970 3980 3990 4000 4010 4020

TGATTTGATT GCCAAAAATA GGATCTCTGG CATGCGAGAA TCTGACGCAG GCAGTTCTAT

4030 4040 4050 4060 4070 4080

GCGGAAGGGC CACACCCTTA GGTAACCCAG TAGATCCAGA GGAATTGTTT TGTCACGATC

4090 4100 4110 4120 4130 4140

AAAGGACTCT GGTACAAAAT CGTATTCATT AAAACCGGGA GGTAGATGAG ATGTGACGAA

4150 4160 4170 4180 4190 4200

CGTGTACATC GACTGAAATC CCTGGTAATC CGTTTTAGAA TCCATGATAA TAATTTTCTG

4210 4220 4230 4240 4250 4260

GATTATTGGT [illegible] GCACCTTCAA AATTTTTTGC AACCCCTTTT TGGAAACAAA

4270 4280 4290 4300 4310 4320

.CTACCGTA GGCTGCGAAA TGTTCATACT GTTGAGCAAT TCACGTTCAT TATAAATGTC

4330 4340 4350 4360 4370 4380

GTTCGCGGGC GCAACTGCAA CTCCCATAAA TAACGCGCCC AACACCGGCA TAAAGAATTG

4390 4400 4410 4420 4430 4440

AAGAGAGTTT TCACTGCATA CGACGATTCT GTGATTTGTA TTCAGCCCAT ATCGTTTCAT

4450 4460 4470 4480 4490 4500

[illegible] AACCGAACGG ACATTTrCAA GTATTCCGCG TACAGCCCGG CCGTTTAAAC

4510 4520 4530 4540 4550 4560

[illegible] CAATArrrTC ATTGAfTGGA ACAGCTGTAG CCCTGAACAG CAGCGTGCGC

4570 4580 4590 4600 4610 4620

TCrTCACCCG [illegible] [illegible] ACAGTATTAC CCGGACGGTC AGCGATATTC

4630 4640 4650 4660 4670 4680

[illegible] [illegible] [illegible] CCCTGCGTGA ATACAGCGCT AAATTTGATA

4690 4700 4710 4720 4730 4740

[illegible] [illegible] [illegible] CTGAAGAGAT CGCCGCCGCC GGCGCGCGTC

4750 4760 4770 4780 4790 4800

TCACCCkCCA ATTAAAAfAG [illegible] CTGCCGTCAA AAATATTGAA ACGTTCCATT

4810 4820 4830 4840 4850 4860

CCGCGCACAT [illegible] [illegible] AAACCCAGCC AGGCGTGCGT TGCCAGCAGG

4870 4880 4890 4900 4910 4920

[illegible] [illegible] [illegible] ATATTCCCGG CGGCTCGGCT CCGCTCTTCT

4930 4940 4950 4960 4970 4980

CAACGGTGCT GATGCTGGCG ACGCCGGCGC GCATTGCGGG ATGCCAGAAG GTGGTTCTGT

4990 5000 5010 5020 5030 5040

GCTCGCCGCC GCCCATCGCT GATGAAATCC TCTATGCGGC GCAACTGTGT GGCGTGCAGG

5050 5060 5070 5080 5090 5100

AAATCTTTAA CGTCGGCGGC GCGCAGGCGA TTGCCGCTCT GGCCTTCGGC AGCGAGTCCG

5110 5120 5130 5140 5150 5160

TACCGAAAGT GGATAAAATT TTTGGCCCCG GCAACGCCTT TGTAACCGAA GCCAAACGTC

5170 5180 5190 5200 5210 5220

AGGTCAGCCA GCGTCTCGAC GGCGCGGCTA TCGATATGCC AGCCGGGCCG TCTGAAGTAC

5230 5240 5250 5260 5270 5280

TGGTGATCGC AGACAGCGGC GCAACACCGG ATTTCGTCGC TTCTGACCTG CTCTCCCAGG

5290 5300 5310 5320 5330 5340 

CTGAGCACGG CCCGGATTCC CAGGTGATCC TGCTGACGCC TGATGCTGAC ATTGCCCGCA 

5350 5360 5370 5380 5390 5400 

AGGTGGCGGA GGCGGTAGAA CGTCAACTGG CGGAACTGCC GCGCGCGGAC ACCGCCCGGC 

5410 5420 5430 5440 5450 5460 

AGGCCCTGAG CGCCAGTCGT CTGATTGTGA CCAAAGATTT AGCGCAGTGC GTCGCCATCT 

5470 5480 5490 5500 5510 5520 

CTAATCAGTA TGGGCCGGAA CACTTAATCA TCCAGACGCG CAATGCGCGC GATTTGGTGG 

5530 5540 5550 5560 5570 5580 

ATGCGATTAC CAGCGCAGGC TCGGTATTTC TCGGCGACTG GTCGCCGCAA TCCGCCGGTG 

5590 5600 5610 5620 5630 5640 

ATTACGCTTC CGGAACCAAC CATGTTTTAC CGACCTATGG CTATACTCCT ACCTGTTCCA 

5650 5660 5670 5680 5690 5700 

GCCTTGGGTT AGCGGATTTC CAGAAACGGA TGACCGTTCA GGAACTGTCG AAAGCGGGCT 

5710 5720 5730 5740 5750 5760 

TTTCCGCTCT GGCATCAACC ATTGAAACAT TGGCGGCGGC AGAACGTCTG ACCGCCCATA 

5770 5780 5790 5800 5810 5820 

AAAATGCCGT GACCCTGCGC GTAAACGCCC TCAAGGAGCA AGCATGAGCA CTGAAAACAC 

5830 5840 5850 5860 5870 5880 

TCTCAGCGTC GCTGACTTAG CCCGTGAAAA TGTCCGCAAC CTGGAGATCC AGACATGGAT 

5890 5900 5910 5920 5930 5940 

AAGATACATT GATGAGTTTG GACAAACCAC AACTAGAATG CAGTGAAAAA AATGCTTTAT

5950 5960 5970 5980 5990 6000 

TTGTGAAATT TGTGATGCTA TTGCTTTATT TGTAACCATT ATAAGCTGCA ATAAACAAGT 

6010 6020 6030 6040 6050 6060 

TAACAACAAC AATTGCATTC ATTTTATGTT TCAGGTTCAG GGGGAGGTGT GGGAGGTTTT 

6070 6080 6090 6100 6110 6120 

TTAAAGCAAG TAAAACCTCT ACAAATGTGG TATGGCTGAT TATGATCTCT AGGGCCGGCC 

6130 6140 6150 6160 6170 6180 

CTCGACGGCG CGTCTAGAGC AGTGTGGTTT TCAAGAGGAA GCAAAAAGCC TCTCCACCCA 

6190 6200 6210 6220 6230 6240

GGCCTGGAAT GTTTCCACCC AATGTCGAGC AGTGTGGTTT TGCAAGAGGA AGCAAAAAGC

6250 6260 6270 6280 6290 6300

CTCTCCACCC AGGCCTGGAA TGTTTCCACC CAATGTCGAG CAAACCCCGC CCAGCGTCTT

6310 6320 6330 6340 6350 6360

GTCATTGGCG AATTCGAACA CGCATATGCA GTCGGGGCGG CGCGGTCCCA GGTCCACTTC

6370 6380 6390 6400 6410 6420

GCATATTAAG GTGGCGCGTG TGGCCTCGAA CACCGAGCGA CCCTGCAGCC AATATGGGAT

6430 6440 6450 6460 6470 6480 

CGCCCATTGA ACAAGATGGA TTGCACGCAG GTTCTCCGGC CGCTTGGGTG GAGAGGCTAT 

6490 6500 6510 6520 6530 6540

TCGGCTATGA CTGGGCACAA CAGACAATCG GCTGCTCTGA TGCCGCCGTG TTCCGGCTGT

6550 6560 6570 6580 6590 6600

CAGCGCAGGG GCGCCCGGTT CTTTTTGTCA AGACCGACCT GTCCGGTGCC CTGAATGAAC

6610 6620 6630 6640 6650 6660

TGCAGGTAAG TGCGGCCGTC GATGGCCGAG GCGGCCTCGG CCTCTGCATA AATAAAAAAA

6670 6680 6690 6700 6710 6720

ATTAGTCAGC CATGCATGGG GCGGAGAATG GGCGGAACTG GGCGGAGTTA GGGGCGGGAT

6730 6740 6750 6760 6770 6780

GGGCGGAGTT AGGGGCGGGA CTATGGTTGC TGACTAATTG AGATGCATGC TTTGCATACT

6790 6800 6810 6820 6830 6840

TCTGCCTGCT GGGGAGCCTG GGGACTTTCC ACACCTGGTT GCTGACTAAT TGAGATGCAT

6850 6860 6870 6880 6890 6900

GCTTTGCATA CTTCTGCCTG CTGGGGAGCC TGGGGACTTT CCACACCCTA ACTGACACAC

6910 6920 6930 6940 6950 6960

ATTCCACAGA ATTAATTCCC CTAGTTATTA ATAGTAATCA ATTACGGGGT CATTAGTTCA

6970 6980 6990 7000 7010 7020

TAGCCCATAT ATGGAGTTCC GCGTTACATA ACTTACGGTA AATGGCCCGC CTGGCTGACC

7030 7040 7050 7060 7070 7080

GCCCAACGAC CCCCGCCCAT TGACGTCAAT AATGACGTAT GTTCCCATAG TAACGCCAAT

7090 7100 7110 7120 7130 7140

AGGGACTTTC CATTGACCTC AATGGGTGGA GTATTTACGG TAAACTGCCC ACTTGGCAGT

7150 7160 7170 7180 7190 7200

ACATCAAGTG TATCATATGC CAAGTACGCC CCCTATTGAC GTCAATGACG GTAAATGGCC

7210 7220 7230 7240 7250 7260

CGCCTGGCAT TATGCCCAGT ACATGACCTT ATGGGACTTT CCTACTTGGC AGTACATCTA

7270 7280 7290 7300 7310 7320

CGTATTAGTC ATCGCTATTA CCATGGTGAT GCGGTTTTGG CAGTACATCA ATGGGCGTGG

7330 7340 7350 7360 7370 7380

ATAGCGGTTT GACTCACGGG GATTTCCAAG TCTCCACCCC ATTGACCTCA ATGGGAGTTT

7390 7400 7410 7420 7430 7440

GTTTTGGCAC CAAAATCAAC GGGACTTTCC AAAATGTCGT AACAACTCCG CCCCATTGAC

7450 7460 7470 7480 7490 7500

GCAAATGGGC GGTAGGCGTG TACGGTGGGA GGTCTATATA AGCAGAGCTG GGTACGTGAA

7510 7520 7530 7540 7550 7560

CCGTCAGATC GCCTGGAGAC GCCATCACAG AJXTCTCACL [illegible] [illegible]

7570 7580 7590 7600 7610 7620

TCAGCTCCTG GGGCTCCTTC TGCTCTGGCT CCCAGGTGCC AGATGTGACA TCCAGATGAC

7630 7640 7650 7660 7670 7680

CCAGTCTCCA TCTTCCCTGT CTGCATCTGT AGGGGACAGA GTCACCATCA CTTGCAGGGC

7690 7700 7710 7720 7730 7740

AAGTCAGGAC ATTAGGTATT ATTTAAATTG CTATCAGCAG AAACCAGGAA AAGCTCCTAA

7750 7760 7770 7780 7790 7800

GCTCCTGATC TATGTTGCAT CCAGTTTGCA AAGTGGGGTC CCATCAAGGT TCAGCGGCAG

7810 7820 7830 7840 7850 7860

TGGATCTGGG ACAGAGTTCA CTCTCACCGT CAGCAGCCTG CAGCCTGAAG ATTTTGCCAC

7870 7880 7890 7900 7910 7920

TTATTACTGT CTACAGGTTT ATAGTACCCC TCGGACGTTC GGCCAAGGGA CCAAGGTGGA

7930 7940 7950 7960 7970 7980

AATCAAACGT ACGGTGGCTG CACCATCTGT CTTCATCTTC CCGCCATCTG ATGAGCAGTT

7990 8000 8010 8020 8030 8040

GAAATCTGGA ACTGCCTCTG TTGTGTGCCT GCTGAATAAC TTCTATCCCA GAGAGGCCAA

8050 8060 8070 8080 8090 8100

AGTACAGTGG AAGGTGGATA ACGCCCTCCA ATCGGGTAAC TCCCAGGAGA GTGTCACAGA

8110 8120 8130 8140 8150 8160

GCAGGACAGC AAGGACAGCA CCTACAGCCT CAGCAGCACC CTGACGCTGA GCAAAGCAGA

8170 8180 8190 8200 8210 8220

CTACGAGAAA CACAAAGTCT ACGCCTGCGA AGTCACCCAT CAGGGCCTGA GCTCGCCCGT

8230 8240 8250 8260 8270 8280

CACAAAGAGC TTCAACAGGG GAGAGTGTTG AATTCAGATC CGTTAACGGT TACCAACTAC

8290 8300 8310 8320 8330 8340

CTAGACTGGA TTCGTGACAA CATGCGGCCG TGATATCTAC GTATGATCAG CCTCGACTGT

8350 8360 8370 8380 8390 8400

GCCTTCTAGT TGCCAGCCAT CTGTTGTTTG CCCCTCCCCC GTGCCTTCCT TGACCCTGGA

8410 8420 8430 8440 8450 8460

AGGTGCCACT CCCACTGTCC TTTCCTAATA AAATGAGGAA ATTGCATCGC ATTGTCTGAG

8470 8480 8490 8500 8510 8520

TAGGTGTCAT TCTATTCTGC GGGGTGCGGT GGGGCAGGAC AGCAAGGGGG AGGATTGGGA

8530 8540 8550 8560 8570 8580

AGACAATAGC AGGCATGCTG GGGATGCGGT GGGCTCTATG GCTTCTGAGG CGGAAAGAAC

8590 8600 8610 8620 8630 8640

CAGCTGGGAC TAGTCGCAAT TGGGCCGAGT TAGGGGCGGG ATGGGCGGAG TTAGGGGCGG

8650 8660 8670 8680 8690 8700

GACTATGGTT GCTGACTAAT TGAGATGCAT GCTTTGCATA CTTCTGCCTG CTGGGGAGCC

8710 8720 8730 8740 8750 8760

TGGGGACTTT CCACACCTGG TTGCTGACTA ATTGAGATGC ATGCTTTGCA TACTTCTGCC

8770 8780 8790 8800 8810 8820

TGCTGGGGAG CCTGGGGACT TTCCACACCC TAACTGACAC ACATTCCACA GAATTAATTC

8830 8840 8850 8860 8870 8880

CCCTAGTTAT TAATAGTAAT CAATTACGGG GTCATTAGTT CATAGCCCAT ATATGGAGTT

8890 8900 8910 8920 8930 8940

CCGCGTTACA TAACTTACCG TAAATGGCCC GCCTGGCTCA CCGCCCAACG ACCCCCGCCC

8950 8960 8970 8980 8990 9000

ATTGACGTCA ATAATGACGT ATGTTCCCAT AGTAACGCCA ATAGGGACTT TCCATTGACG

9010 9020 9030 9040 9050 9060

TCAATGGGTG GAGTATTTAC GGTAAACTGC CCACTTCGCA GTACATCAAG TGTATCATAT

9070 9080 9090 9100 9110 9120

GCCAAGTACG CCCCCTATTG ACGTCAATGA CGGTAAATGG CCCGCCTGGC ATTATGCCCA

9130 9140 9150 9160 9170 9180

GTACATCACC TTATGGGACT TTCCTACTTG GCAGTACATC TACGTATTAG TCATCGCTGT

9190 9200 9210 9220 9230 9240

TACCATGGTG ATGCGGTTTT GGCAGTACAT CAATGGGCGT GGATAGCGGT TTGACTCACG 

9250 9260 9270 9280 9290 9300 

GGGATTTCCA AGTCTCCACC CCATTGACGT CAATGGGAGT TTGTTTTCGC ACCAAAATCA 

9310 9320 9330 9340 9350 9360 

ACGGGACTTT CCAAAATGTC GTAACAACTC CGCCCCATTG ACGCAAATGG GCGCTAGGCG 

9370 9380 9390 9400 9410 9420 

TGTACGGTGG GAGGTCTATA TAAGCAGAGC TGGGTACGTG AACCGTCAGA TCGCCTGGAG 

9430 9440 9450 9460 9470 9480 

ACGCCGTCGA CATGGGTTGG AGCCTCATCT TGCTCTTCCT TGTCGCTGTT GCTACGCGTG 

9490 9500 9510 9520 9530 9540 

, CCTGTCCCA GGTGCAGCTG GTGGAGTCTG GGGGCGGCTT GGCAAAGCCT GGGGGGTCCC 

9550 9560 9570 9580 9590 9600 

TGAGACTCTC CTGCGCAGCC TCCGGGTTCA GGTTCACCTT CAATAACTAC TACATGGACT 

9610 9620 9630 9640 9650 9660 

GGGTCCGCCA GGCTCCAGGG CAGGGGCTGG AGTGGGTCTC ACGTATTAGT AGTAGTGGTG 

9670 9680 9690 9700 9710 9720 

ATCCCACATG GTACGCAGAC TCCGTGAAGG GCAGATTCAC CATCTCCAGA GAGAACGCCA

9730 9740 9750 9760 9770 9780 

AGAACACACT GTTTCTTCAA ATGAACAGCC TGAGAGCTGA GGACACGGCT GTCTATTACT 

9790 9800 9810 9820 9830 9840 

GTGCGAGCTT GACTACAGGG TCTGACTCCT GGGGCCAGGG AGTCCTGGTC ACCGTCTCCT 

9850 9860 9870 9880 9890 9900 

CAGCTAGCAC CAAGGGCCCA TCGGTCTTCC CCCTCGCACC CTCCTCCAAG AGCACCTCTG

9910 9920 9930 9940 9950 9960

GGCGCACAGC GGCCCTGGGC TGCCTGGTCA AGGACTACTT CCCCGAACCG GTGACGGTGT

9970 9980 9990 10000 10010 10020

CGTGGAACTC AGGCGCCCTG ACCAGCGGCG TGCACACCTT CCCGGCTGTC CTACAGTCCT

10030 10040 10050 10060 10070 10080 

CAGGACTCTA CTCCCTCAGC AGCGTGGTGA CCGTGCCCTC CAGCAGCTTG GGCACCCAGA 

10090 10100 10110 10120 10130 10140 

CCTACATCTG CAACGTGAAT CACAAGCCCA GCAACACCAA GGTGGACAAG AAAGTTGAGC 

10150 10160 10170 10180 10190 10200 

CCAAATCTTG TGACAAAACT CACACATGCC CACCGTGCCC AGCACCTGAA CTCCTGGGGG 

10210 10220 10230 10240 10250 10260 

GACCGTCAGT CTTCCTCTTC CCCCCAAAAC CCAAGGACAC CCTCATGATC TCCCGGACCC

10270 10280 10290 10300 10310 10320

CTGAGGTCAC ATGCGTGGTG GTGGACGTGA GCCACGAAGA CCCTGAGGTC AAGTTCAACT

10330 10340 10350 10360 10370 10380 

GGTACGTGGA CGGCGTGGAG GTGCATAATG CCAAGACAAA GCCGCGGGAG GAGCACTACA 

10390 10400 10410 10420 10430 10440 

ACAGCACGTA CCGTGTGGTC AGCGTCCTCA CCGTCCTGCA CCAGGACTGG CTGAATGGCA

10450 10460 10470 10480 10490 10500 

ACCAGTACAA GTGCAAGCTC TCCAACAAAG CCCTCCCAGC CCCCATCGAG AAAACCATCT 

10510 10520 10530 10540 10550 10560 

CCAAAGCCAA AGGGCAGCCC CGAGAACCAC AGGTGTACAC CCTGCCCCCA TCCCGGGATG 

10570 10580 10590 10600 10610 10620 

AGCTGACCAA GAACCAGGTC AGCCTGACCT GCCTGGTCAA AGGCTTCTAT CCCACCCACA 

10630 10640 10650 10660 10670 10680 

TCGCCGTGGA GTGGCACAGC AATGGGCAGC CGGAGAACAA CTACAAGACC ACGCCTCCCG 

10690 10700 10710 10720 10730 10740 

TGCTGGACTC CGACGGCTCC TTCTTCCTCT ACAGCAAGCT CACCGTGGAC AAGAGCAGGT

10750 10760 10770 10780 10790 10800

GGCAGCAGGG GAACGTCTTC TCATGCTCCG TCATGCATGA GGCTCTGCAC AACCACTACA

10810 10820 10830 10840 10850 10860 

CGCAGAAGAG CCTCTCCCTG TCTCCGGGTA AATGAGGATC CGTTAACGGT TACCAACTAC 

10870 10880 10890 10900 10910 10920 

CTAGACTGGA TTCGTGACAA CATGCGGCCG TGATATCTAC GTATGATCAG CCTCGACTGT 

10930 10940 10950 10960 10970 10980 

GCCTTCTAGT TGCCAGCCAT CTGTTGTTTG CCCCTCCCCC GTGCCTTCCT TGACCCTGGA 

10990 11000 11010 11020 11030 11040

AGGTGCCACT CCCACTGTCC TTTCCTAATA AAATGAGGAA ATTGCATCGC ATTGTCTGAG

11050 11060 11070 11080 11090 11100

TAGGTGTCAT TCTATTCTGG GGGGTGGGGT GGGGCAGGAC AGCAAGGGGG AGGATTGGGA

11110 11120 11130 11140 11150 11160

AGACAATAGC AGGCATGCTG GGGATGCGGT GGGCTCTATG GCTTCTGAGG CGGAAAGAAC

11170 11180 11190 11200 11210 11220

CAGCTGGGGC TCGACAGCAA CGCTAGGTCG AGGCCGCTAC TAACTCTCTC CTCCCTCCTT 

11230 11240 11250 11260 11270 11280 

TTTCCTGCAG GACGAGGCAG CGCGGCTATC GTGGCTGGCC ACGACGGGCG TTCCTTGCGC 

11290 11300 11310 11320 11330 11340

AGCTGTGCTC GACGTTGTCA CTGAAGCGGG AAGGGACTGG CTGCTATTGG GCGAAGTGCC

11350 11360 11370 11380 11390 11400

GGGGCAGGAT CTCCTGTCAT CTCACCTTGC TCCTGCCCAG AAAGTATCCA TCATGGCTGA

11410 11420 11430 11440 11450 11460

TGCAATGCGG CGGCTGCATA CGCTTGATCC GGCTACCTGC CCATTCGACC ACCAAGCGAA

11470 11480 11490 11500 11510 11520

ACATCGCATC GAGCGAGCAC GTACTCGGAT GGAAGCCGGT CTTGTCGATC AGGATGATCT

11530 11540 11550 11560 11570 11580

GGACGAAGAG CATCAGGGGC TCGCGCCAGC CGAACTGTTC GCCAGCTAAG TGACCTCCAA

11590 11600 11610 11620 11630 11640

TTCAAGCTCT CGAGCTACGG CGGCCAGCTA GTAGCTTTGC TTCTCAATTT CTTATTTGCA

11650 11660 11670 11680 11690 11700

TAATGAGAAA AAAAGGAAAA TTAATTTTAA CACCAATTCA GTAGTTGATT GAGCAAATGC

11710 11720 11730 11740 11750 11760

GTTGCCAAAA AGGATGCTTT AGAGACAGTG TTCTCTGCAC AGATAAGGAC AAACATTATT

11770 11780 11790 11800 11810 11820

CAGAGCGAGT ACCCAGAGCT GAGACTCCTA AGCCAGTGAG TGGCACAGCA TCCAGGGAGA 

11830 11840 11850 11860 11870 11880 

AATATGCTTG TCATCACCGA AGCCTGATTC CGTAGAGCCA CACCCTGGTA AGGGCCAATC 

11890 11900 11910 11920 11930 11940 

TGCTCACACA GGATAGAGAG GGCAGGAGCC AGGGCAGAGC ATATAAGGTG AGGTAGGATC 

11950 11960 11970 11980 11990 12000 

AGTTGCTCCT CACATTTGCT TCTGACATAG TTGTGTTGGG AGCTTGGATA GCTTGGGCGG

12010 12020 12030 12040 12050 12060

GGGACAGCTC AGGCCTGCGA TTTCGCGCCA AACTTGACGG CAATCCTAGC GTGAAGGCTG

12070 12080 12090 12100 12110 12120

GTAGGATTTT ATCCCCGCTG CCATCATGGT TCGACCATTG AACTGCATCG TCGCCGTGTC

12130 12140 12150 12160 12170 12180

CCAAAATATG GGGATTGGCA AGAACGGAGA CCTACCCTGG CCTCCGCTCA GGAACGAGTT

12190 12200 12210 12220 12230 12240 

CAAGTACTTC CAAAGAATGA CCACAACCTC TTCAGTGGAA GGTAAACAGA ATCTGGTGAT 

12250 12260 12270 12280 12290 12300 

TATGGGTAGG AAAACCTGGT TCTCCATTCC TGAGAAGAAT CGACCTTTAA AGGACAGAAT 

12310 12320 12330 12340 12350 12360 

TAATATAGTT CTCAGTAGAG AACTCAAAGA ACCACCACGA GGAGCTCATT TTCTTGCCAA 

12370 12380 12390 12400 12410 12420

AAGTTTGGAT GATGCCTTAA CGTAGGCGCG CCATTAAGAC TTATTGAACA ACCGGAATTG

12430 12440 12450 12460 12470 12480

GCAAGTAAAG TAGACATGGT TTGGATAGTC GGAGGCAGTT CTGTTTACCA GGAAGCCATG

12490 12500 12510 12520 12530 12540

AATCAACCAG GCCACCTCAG ACTCTTTGTG ACAAGGATCA TGCAGGAATT TGAAAGTGAC

12550 12560 12570 12580 12590 12600

ACGTTTTTCC CAGAAATTCA TTTGGGGAAA TATAAACTTC TCCCAGAATA CCCAGGCGTC 

12610 12620 12630 12640 12650 12660 

CTCTCTGAGG TCCAGGAGGA AAAAGGCATC AAGTATAAGT TTGAAGTCTA CGAGAAGAAA 

12670 12680 12690 12700 12710 12720 

GACTAACAGC AAGATGCTTT CAAGTTCTCT GCTCCCCTCC TAAAGCTATG CATTTTTATA

12730 12740 12750 12760 12770 12780

ACACCATGGG ACTTTTGCTG GCTTTAGATC AGCCTCGACT GTGCCTTCTA GTTGCCACCC

12790 12800 12810 12820 12830 12840 

ATCTGTTGTT TGCCCCTCCC CCGTGCCTTC CTTGACCCTG GAAGGTGCCA CTCCCACTGT 

12850 12860 12870 12880 12890 12900 

CCTTTCCTAA TAAAATGAGG AAATTGCATC GCATTGTCTG AGTAGGTGTC ATTCTATTCT 

12910 12920 12930 12940 12950 12960 

GGGGGGTGGG GTGGGGCAGG ACAGCAAGGG GGAGGATTGG GAAGACAATA GCAGGCATGC 

12970 12980 12990 13000 13010 13020 

TGGGGATGCG GTGCGCTCTA TGCCTTCTGA GGCGGAAAGA ACCAGCTGCG GCTCGAAGCG 
13030 13040 13050 13060 13070 13080

GCCGCCCATT TCGCTGGTCG TCAGATGCGG CATGGCGTGG GACGCGGCGG GGAGCGTCAC 

13090 13100 13110 13120 13130 13140 

ACTGAGGTTT TCCGCCAGAC GCCACTGCTG CCAGGCGCTG ATGTGCCCGG CTTCTGACCA 

13150 13160 13170 13180 13190 13200 

TGCGGTCGCG TTCGGTTGCA CTACGCGTAC TGTGAGCCAG AGTTGCCCGG CGCTCTCCGG 

13210 13220 13230 13240 13250 13260 

CTGCGGTAGT TCAGGCAGTT CAATCAACTG TTTACCTTGT GGAGCGACAT CCAGAGGCAC 

13270 13280 13290 13300 13310 13320 

TTCACCGCTT GCCAGCGGCT TACCATCCAG CGCCACCATC CAGTGCAGGA GCTCCTTATC 

13330 13340 13350 13360 13370 13380 

GCTATGACGG AACAGGTATT CGCTGGTCAC TTCGATGGTT TGCCCCGATA AACGGAACTG

13390 13400 13410 13420 13430 13440 

GAAAAACTGC TGCTGGTGTT TTGCTTCCGT CAGCGCTGGA TGCGGCGTGC GGTCGGCAAA

13450 13460 13470 134S0 13490 13500 

GACCAGACCG TTCATACAGA ACTGGCGATC GTTCGGCGTA TCGCCAAAAT CACCGCCGTA 

13510 13520 13530 13540 13550 13560 

AGCCGACCAC GGGTTGCCGT TTTCATCATA TTTAATCAGC GACTGATCCA CCCACTCCCA

13570 13580 13590 13600 13610 13620 

GACGAAGCCG CCCTGTAAAC GGGGATACTG ACGAAACGCC TGCCAGTATT TAGCGAAACC 

13630 13640 13650 13660 13670 13680 

GCCAAGACTG TTACCCATCG CGTGGGCGTA TTCGCAAAGG ATCAGCGGGC GCGTCTCTCC

13690 13700 13710 13720 13730 13740

AGCTAGCGAA ACCCATTTTT TGATGGACCA TTTCGGCACA GCCGGGAAGG GCTGGTCTTC

13750 13760 13770 13780 13790 13800

ATCCACCCGC GCGTACATCG GGCAAATAAT ATCGGTGGCC GTGGTGTCGG CTCCGCCGCC

13810 13820 13830 13840 13850 13860 

TTCATACTGC ACCGGGCGCG AAGGATCGAC AGATTTGATC CAGCCATACA GCGCGTCGTG 

13870 13880 13890 13900 13910 13920 

ATTAGCGCCG TGGCCTGATT CATTCCCCAG CGACCAGATG ATCACACTCG GGTGATTACG 

13930 13940 13950 13960 13970 13980 

ATCGCGCTGC ACCATTCGCG TTACGCGTTC GCTCATCGCC GGTAGCCAGC GCGGATCATC 

13990 14000 14010 14020 14030 14040 

GGTCAGACGA TTCATTGCCA CCATGCCGTG GGTTTCAATA TTGGCTTCAT CCACCACATA 

14050 14060 14070 14080 14090 14100 

CACCCCGTAG CGGTCGCACA GCGTGTACCA CAGCGGATGG TTCGGATAAT GCGAACAGCG

14110 14120 14130 14140 14150 14160

CACGGCGTTA AACTTGTTCT GCTTCATCAG CAGGATATCC TGCACCATCG TCTGCTCATC 

14170 14180 14190 14200 14210 14220 

CATCACCTGA CCATGCAGAG GATGATGCTC GTGACGGTTA ACGCCTCGAA TCAGCAACGG 

14230 14240 14250 14260 14270 14280 

CTTGCCGTTC AGCAGCAGCA GACCATTTTC AATCCGCACC TCGCGCAAAC CGACATCGCA 

14290 14300 14310 14320 14330 14340

GGCTTCTGCT TCAATCAGCG TGCCGTCGGC GGTGTGCAGT TCAACCACCG CACGATAGAG

14350 14360 14370 14380 14390 14400 

ATTCGGGATT TCGGCGCTCC ACAGTTTCGG GTTTTCGACG TTCAGACGTA GTGTGACGCG

14410 14420 14430 14440 14450 14460 

ATCGGCATAA CCACCACGCT CATCGATAAT TTCACCGCCG AAAGGCGCGG TGCCGCTGGC 

14470 14480 14490 14500 14510 14520

GACCTGCGTT TCACCCTGCC ATAAAGAAAC TGTTACCCGT AGGTAGTCAC GCAACTCGCC 

14530 14540 14550 14560 14570 14580 

GCACATCTGA ACTTCAGCCT CCAGTACAGC GCGGCTGAAA TCATCATTAA ACCCAGTGGC 

14590 14600 14610 14620 14630 14640 

AACATGGAAA TCGCTGATTT GTGTAGTCGG TTTATGCAGC AACGAGACGT CACGGAAAAT 

14650 14660 14670 14680 14690 14700 

GCCGCTCATC CGCCACATAT CCTGATCTTC CAGATAACTC CCGTCACTCC AGCGCAGCAC

14710 14720 14730 14740 14750 14760 

CATCACCGCG AGGCGGTTTT CTCCGGCGCG TAAAAATGCG CTCAGGTCAA ATTCAGACGG 

14770 14780 14790 14800 14810 14820 

CAAACGACTG TCCTGGCCGT AACCGACCCA GCGCCCGTTG CACCACAGAT GAAACGCCGA 

14830 14840 14850 14860 14870 14880 

GTTAACGCCA TCAAAAATAA TTCGCGTCTG GCCTTCCTGT AGCCAGCTTT CATCAACATT 

14890 14900 14910 14920 14930 14940

AAATGTGAGC GAGTAACAAC CCGTCGGATT CTCCGTGGGA ACAAACGGCG GATTGACCGT

14950 14960 14970 14980 14990 15000

AATGGGATAG GTCACGTTGG TGTAGATCGG CGCATCGTAA CCGTGCATCT GCCACTTTGA

15010 15020 15030 15040 15050 15060

GGGGACGACG ACAGTATCGG CCTCAGGAAG ATCGCACTCC AGCCAGCTTT CCGGCACCGC

15070 15080 15090 15100 15110 15120

TTCTGGTGCC GGAAACCAGG CAAAGCGCCA TTCGCCATTC AGGCTGCGCA ACTGTTGGGA

15130 15140 15150 15160 15170 15180 

AGGGCGATCG GTGCGGGCCT CTTCGCTATT ACGCCAGCTG GCGAAAGGGG GATGTGCTGC

15190 15200 15210 15220 15230 15240 

AAGGCCATTA AGTTGGGTAA CGCCAGGGTT TTCCCAGTCA CGACGTTGTA AAACGACTTA 

15250 15260 15270 15280 15290 15300 

ATCCGTCGAG GGGCTGCCTC GAAGCAGACG ACCTTCCGTT GTGCAGCCAG CGGCGCCTGC 

15310 15320 15330 15340 15350 15360 

GCCGGTGCCC ACAATCGTCC GCGAACAAAC TAAACCAGAA CAAATTATAC CGGCGGCACC 

15370 15380 15390 15400 15410 15420 

GCCGCCACCA CCTTCTCCCG TGCCTAACAT TCCAGCCCCT CCACCACCAC CACCACCATC 

15430 15440 15450 15460 15470 15480 

GATCTCTCAA TTCCCGCCCG CTCCACCAAT GCCGACGGAA CCTCAACCCG CTGCACCTTT 

15490 15500 15510 15520 15530 15540 

AGACGACAGA CAACAATTGT TGGAAGCTAT TAGAAACGAA AAAAATCGCA CTCGTCTCAG 

15550 15560 15570 15580 15590 15600 

ACCGGTCAAA CCAAAAACGG CGCCCGAAAC CAGTACAATA GTTGAGGTGC CGACTGTGTT

15610 15620 15630 15640 15650 15660

GCCTAAAGAG ACATTTGAGC CTAAACCGCC GTCTGCATCA CCGCaCUC CTCCGCCTCC 

15670 15680 15690 15700 15710 15720 

GCCTCCGCCG CCAGCCCCGC CTGCCCCTCC ACCGATGGTA GATTTATCAT CAGCTCCACC 

15730 15740 15750 15760 15770 15780 

ACCGCCGCCA TTACTAGATT TGCCGTCTGA AATGTTACCA CCGCCTGCAC CATCGCTTTC 

15790 15800 15810 15820 15830 15840 

TAACGTGTTG TCTGAATTAA AATCGGGCAC AGTTAGATTG AAACCCGCCC AAAAACGCCC 

15850 15860 15870 15880 15890 15*60 

GCAATCAGAA ATAATTCCAA AAAGCTCAAC TACAAATTTG ATCGCGGACG TGTTAGCC6A 

15910 15920 15930 15940 15950 15960

CACAATTAAT AGGCGTCGTG TGGCTATGGC AAAATCGTCT TCGGAAGCAA CTTCTAACGA

15970 15980 15990 16000 16010 16020 

'AGGGTTGG GACGACGACG ATAATCGGCC TAATAAAGCT AACACGCCCG ATGTTAAATA 

16030 16040 16050 16060 16070 16080 

JGTCCAAGCT ACTAGTGGTA CCGCTTGGCA GAACATATCC ATCGCGTCCG CCATCTCCAG

16090 16100 16110 16120 16130 16140

CAGCCGCACG CGGCGCATCT CGGGCAGCGT TGGGTCCTGG CCACGGGTGC GCATGATCGT

16150 16160 16170 16180 16190 16200 

GCTCCTGTCG TTGAGGACCC GGCTAGGCTG GCGGGGTTGC CTTACTGGTT AGCAGAATGA 

16210 16220 16230 16240 16250 16260

ATCACCGATA CGCGAGCGAA CGTGAAGCGA CTGCTGCTGC AAAACGTCTG CGACCTGAGC 

16270 16280 16290 16300 16310 16320

AACAACATGA ATGGTCTTCG GTTTCCCTGT TTCGTAAAGT CTGGAAACCC GGAAGTCAGC

16330 16340 16350 16360 16370 16380

:CCTGCACC ATTATGTTCC GGATCTGCAT CGCAGGATGC TGCTGGCTAC CCTGTGGAAC

16390 16400 16410 16420 16430 16440

ACCTACATCT GTATTAACGA AGCGCTCGCA TTGACCCTGA GTCATTTTTC TCTGGTCCCG 

16450 16460 16470 16480 16490 16500

CCGCATCCAT ACCGCCAGTT GTTTACCCTC ACAACGTTCC ACTAACCGGG CATGTTCATC

16510 16520 16530 16540 16550 16560

ATCAGTAACC CGTATCGTGA GCATCCTCTC TCGTTTCATC GGTATCATTA CCCCCATGAA 

16570 16580 16590 16600 16610 16620 

CACAAATCCC CCTTACACGG ACGCATCAGT GACCAAACAG GAAAAAACCG CCCTTAACAT 

16630 16640 16650 16660 16670 16680

GCCCCGCTTT ATCAGAAGCC AGACATTAAC GCTTCTGGAG AAACTCAACG AGCTGGACGC

16690 16700 16710 16720 16730 16740

GGATGAACAG GCAGACATCT GTGAATCGCT TCACGACCAC GCTGATGAGC TTTACCGCAG 

16750 16760 16770 16780 16790 16800

CTGCCTCGCG CGTTTCGGTG ATGACGGTGA AAACCTCTGA CACATGCAGC TCCCGGAGAC

16810 16820 16830 16840 16850 16860 

GGTCACAGCT TGTCTCTAAG CGGATGCCGG GAGCACACAA GCCCGTCAGG GCGCGTCAGC

16870 16880 16890 16900 16910 16920 

GGGTGTTGGC GGGTGTCGGG GCGCAGCCAT GACCCAGTCA CGTAGCGATA GCGGAGTGTA
16930 16940 16950 16960 16970 16980

TACTGGCTTA ACTATGCGGC ATCAGACCAG ATTGTACTGA GAGTCCACCA TATGCGGTGT 

16990 17000 17010 17020 17030 17040

GAAATACCGC ACAGATGCGT AAGGAGAAAA TACCCCATCA GGCGCTCTTC CGCTTCCTCG

17050 17060 17070 17080 17090 17100

CTCACTGACT CGCTGCGCTC GGTCGTTCGG CTGCGGCGAG CGGTATCAGC TCACTCAAAG 

17110 17120 17130 17140 17150 17160

GCCGTAATAC GGTTATCCAC AGAATCAGGG GATAACGCAG GAAAGAACAT GTGAGCAAAA 

17170 17180 17190 17200 17210 17220

GGCCAGGAAA AGGCCAGGAA CCGTAAAAAG GCCGCGTTGC TGGCGTTTTT CCATAOTCTC 

17230 17240 17250 17260 17270 17280

CGCCCCCCTG ACGAGCATCA CAAAAATCGA CGCTCAAGTC AGAGGTGGCG AAACCCGACA

17290 17300 17310 17320 17330 17340

GGACTATAAA GATACCAGGC GTTTCCCCCT GGAAGCTCCC TCGTGCGCTC TCCTGTTCCG 

17350 17360 17370 17380 17390 17400

ACCCTGCCGC TTACCGGATA CCTCTCCGCC TTTCTCCCTT CGGGAAGCGT GGCGCTTTCT 

17410 17420 17430 17440 17450 17460

CATAGCTCAC GCTGTAGGTA TCTCAGTTCG GTGTAGGTCG TTCGCTCCAA GCTGCGCTGT

17470 17480 17490 17500 17510 17520

GTGCACGAAC CCCCCGTTCA GCCCGACCGC TGCGCCTTAT CCGGTAACTA TCGTCTTGAG

17530 17540 17550 17560 17570 17580

TCCAACCCGG TAAGACACGA CTTATCGCCA CTGGCAGCAG CCACTGGTAA CAGGATTAGC

17590 17600 17610 17620 17630 17640

AGAGCGAGGT ATGTAGGCGG TGCTACAGAG TTGTTGAAGT GGTGGCCTAA CTACGGCTAC 

17650 17660 17670 17680 17690 17700

ACTAGAAGGA CAGTATTTGG TATCTGCGCT CTGCTGAAGC CAGTTACCTT CGGAAAAAGA 

17710 17720 17730 17740 17750 17760

GTTCGTAGCT CTTGATCCGG CAAACAAACC ACCGCTGGTA GCGGTGGTTT TTTTGTTTGC 

17770 17780 17790 17800 17810 17820

AAGCAGCAGA TTACGCGCAG AAAAAAAGGA TCTCAAGAAG ATCCTTTGAT CTTTTCTACG 

17830 17840 17850 17860 17870 17880

GGCTCTGACG CTCAGTGGAA CGAAAACTCA CGTTAAGGGA TTTTGGTCAT GAGATTATCA 

17890 17900 17910 17920 17930 17940

AAAAGGATCT TCACCTACAT CCTTTTAAAT TAAAAATGAA GTTTTAAATC AATCTAAAGT 

17950 17960 17970 17980 17990 18000

ATATATCAGT AAACTTGGTC TGACAGTTAC CAATGCTTAA TCAGTGAGGC ACCTATCTCA

18010 18020 18030 18040 18050 18060

GCGATCTGTC TATTTCGTTC ATCCATACTT GCCTGACTCC CCGTCGTGTA GATAACTACG

18070 18080 18090 18100 18110 18120

ATACGGGAGG GCTTACCATC TGGCCCCAGT GCTGCAATGA TACCGCGAGA CCCACGCTCA

18130 18140 18150 18160 18170 18180

CCGGCTCCAG ATTTATCAGC AATAAACCAG CCAGCCGGAA GGGCCGAGCG CAGAAGTGGT 

18190 18200 18210 18220 18230 18240




CTTCCAACTT TATCCGCCTC CATCCAGTCT ATTAATTGTT GCCGGGAAGC TAGAGTAAGT

18250 18260 18270 18280 18290 18300

AGTTCGCCAG TTAATAGTTT GCGCAACGTT GTTGCCATTG CTGCAGGCAT CGTGGTGTCA 

18310 18320 18330 18340 18350 18360 

CGCTCGTCGT TTGGTATGGC TTCATTCAGC TCCGCTTCCC AACGATCAAG GCGAGTTACA 

18370 18380 18390 18400 18410 18420 

TGATCCCCCA TGTTGTGCAA AAAACCCGTT AGCTCCTTCG GTCCTCCGAT CGTTGTCAGA 

18430 18440 18450 18460 18470 18480

AGTAAGTTGG CCGCACTGTT ATCACTCATG GTTATGGCAG CACTGCATAA TTCTCTTACT 

18490 18500 18510 18520 18530 18540 

GTCATGCCAT CCGTAAGATG CTTTTCTCTG ACTGGTGAGT ACTCAACCAA GTCATTCTGA 

18550 18560 18570 18580 18590 18600 

GAATAGTGTA TGCGGCGACC GAGTTGCTCT TGCCCGGCGT CAACACGGGA TAATACCGCG

18610 18620 18630 18640 18650 18660 

CCACATAGCA GAACTTTAAA AGTGCTCATC ATTGGAAAAC GTTCTTCGGG GCGAAAACTC

18670 18680 18690 18700 18710 18720 

TCAAGGATCT TACCGCTGTT GAGATCCAGT TCGATGTAAC CCACTCGTGC ACCCAACTGA

18730 18740 18750 18760 18770 18780 

TCTTCAGCAT CTTTTACTTT CACCACCGTT TCTGGGTGAG CAAAAACAGG AAGGCAAAAT

18790 18800 18810 18820 18830 18840

GCCGCAAAAA AGGGAATAAG GGCGACACGG AAATCTTGAA TACTCATACT CTTCCTTTTT 

18850 18860 18870 18880 18890 18900 

CAATATTATT GAAGCATTTA TCAGGGTTAT TGTCTCATGA GCGGATACAT ATTTGAATGT 

18910 18920 18930 18940 18950 18960

ATTTAGAAAA ATAAACAAAT AGGGGTTCCG CGCACATTTC CCCGAAAAGT GCCACCTGAC 

18970 18980 18990 19000 19010 19020

GTCTAAGAAA CCATTATTAT CATGACATTA ACCTATAAAA ATAGGCGTAT CACGAGGCCC

19030 19040 19050 19060 19070 19080
TTTCGTCTTC AAGAA 



INTERNATIONAL SEARCH REPORT 



International Application No

PCT/US 98/03935 



A. CLASSIFICATION OF SUBJECT MATTER 

IPC 6: C12N15/90, C12N15/13, C12N15/62, C12N15/85, C07K16/28, C07K19/00, C12Q1/68, C12N15/12, C12N5/10, C07K14/705, C12N9/12, G01N33/53



According to International Patent Classification (IPC) or to both national classification and IPC



B. FIELDS SEARCHED 



Minimum documentation searched (classification system followed by classification symbols)

IPC 6 C12N C12Q C07K G01N 



Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched 



Electronic data base consulted during the international search (name of data base and, where practical, search terms used)



C. DOCUMENTS CONSIDERED TO BE RELEVANT 



Category * | Citation of document, with indication, where appropriate, of the relevant passages | Relevant to claim No.

A    WO 94 11523 A (IDEC PHARMACEUTICALS CORPORATION (US); REFF MITCHELL E. (US)) 26 May 1994
     cited in the application
     see abstract
     see page 9, line 21 - page 10, line 29
     see page 41, line 19 - page 42, line 19; figure 6
     Relevant to claim No.: 1, 4-8, 11, 12, 25-29, 31, 32

A    US 5 464 764 A (CAPECCHI MARIO R. AND KIRK THOMAS R.) 7 November 1995
     see abstract
     see column 13, line 32 - column 14, line 5
     Relevant to claim No.: 1

A    WO 94 05784 A (UNITED STATES AMERICA REPRESENTED BY THE SECRETARY US DPT. AGRICULTURE) 17 March 1994
     see abstract
     Relevant to claim No.: 1

-/--



[X] Further documents are listed in the continuation of box C.

[X] Patent family members are listed in annex.



* Special categories of cited documents:

"A" document defining the general state of the art which is not
considered to be of particular relevance

"E" earlier document but published on or after the international
filing date

"L" document which may throw doubts on priority claim(s) or
which is cited to establish the publication date of another
citation or other special reason (as specified)

"O" document referring to an oral disclosure, use, exhibition or
other means

"P" document published prior to the international filing date but
later than the priority date claimed

"T" later document published after the international filing date
or priority date and not in conflict with the application but
cited to understand the principle or theory underlying the
invention

"X" document of particular relevance; the claimed invention
cannot be considered novel or cannot be considered to
involve an inventive step when the document is taken alone

"Y" document of particular relevance; the claimed invention
cannot be considered to involve an inventive step when the
document is combined with one or more other such documents,
such combination being obvious to a person skilled
in the art.

"&" document member of the same patent family



Date of the actual completion of the international search

23 July 1998 


Date of mailing of the international search report

05/08/1998 


Name and mailing address of the ISA 

European Patent Office, P.B. 5818 Patentlaan 2
NL - 2280 HV Rijswijk
Tel. (+31-70) 340-2040, Tx. 31 651 epo nl,
Fax: (+31-70) 340-3016


Authorized officer 

Macchia, G






INTERNATIONAL SEARCH REPORT 



International Application No

PCT/US 98/03935 



C.(Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT

Category * | Citation of document, with indication, where appropriate, of the relevant passages



Relevant to claim No. 



WO 93 24642 A (TSI CORPORATION (US)) 9 December 1993
see abstract

BARNETT R.S. ET AL.: "Antibody production in Chinese hamster ovary cells using an impaired selectable marker"
ACS SYMPOSIUM SERIES: ANTIBODY EXPRESSION AND ENGINEERING,
vol. 604, 1995, pages 27-40, XP002072464



Form PCT/ISA/210 (continuation of second sheet) (July 1992)






INTERNATIONAL SEARCH REPORT 



International Application No

PCT/US 98/03935 



Patent document          Publication   Patent family     Publication
cited in search report   date          member(s)         date

WO 9411523 A             26-05-1994    AU 682481 B       09-10-1997
                                       AU 5613294 A      08-06-1994
                                       CA 2149326 A      26-05-1994
                                       DE 669986 T       10-10-1996
                                       EP 0669986 A      06-09-1995
                                          2088838 T      01-10-1996
                                       JP 8503138 T      0Q-04-1Q96
                                       US 5648267 A      15-07-1997
                                       US 5733779 A      31-03-1998

US 5464764 A             07-11-1995    US 5487992 A      30-01-1996
                                       US 5627059 A      06-05-1997
                                       US 5631153 A      20-05-1997

WO 9405784 A             17-03-1994    AU 4839493 A      29-03-1994
                                       MX 9305183 A      31-05-1994

WO 9324642 A             09-12-1993    AU 4401993 A      30-12-1993



Form PCT/ISA/210 (patent family annex) (July 1992)



